## **Data for Social Good** Non-Profit Sector Data Projects

**Jane Farmer Anthony McCosker Kath Albury Amir Aryani**


Jane Farmer Social Innovation Research Institute Swinburne University of Technology Melbourne, VIC, Australia

Kath Albury Swinburne University of Technology School of Social Sciences, Media, Film & Education Melbourne, VIC, Australia

Anthony McCosker Social Innovation Research Institute Swinburne University of Technology Melbourne, VIC, Australia

Amir Aryani Social Data Analytics Lab Swinburne University of Technology Melbourne, VIC, Australia

ISBN 978-981-19-5553-2 ISBN 978-981-19-5554-9 (eBook) https://doi.org/10.1007/978-981-19-5554-9

© The Author(s) 2023. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover illustration: Pattern © Melisa Hasan

This Palgrave Macmillan imprint is published by the registered company Springer Nature Singapore Pte Ltd.

The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

## **Acknowledgements**

We pay our respects to the traditional custodians of the lands on which we work and acknowledge their Elders, past and present.

We would like to acknowledge funders of our research on data for social good, including the Australian Research Council (ARC) for Linkage Infrastructure Equipment and Facilities, grant no. LE200100074; Data Co-operative Platform for Social Impact; the ARC Centre of Excellence for Automated Decision Making and Society, grant no. CE200100005; ARC Discovery Project, grant no. DP200100419; Victoria State Government; Australian Red Cross; Lord Mayor's Charitable Foundation; and City of Greater Bendigo Data Co-op partners.

We would like to acknowledge the contribution of the Regional Innovation Data Lab (RIDL) at Griffith University, Queensland, Australia, and the Visualisation and Decision Analytics (VIDEA) Lab at the University of Canberra, Australia.






## **About the Authors**

**Kath Albury** is an Australian Research Council Future Fellow (2022–2025) in the Department of Media and Communication at Swinburne University of Technology. She is an Associate Investigator in the Australian Research Council Centre of Excellence for Automated Decision Making and Society, and a programme leader in the Social Innovation Research Institute. Kath's research investigates the intersections of digital technologies and platforms, digital literacy, data capabilities and sexual health and wellbeing. She is a co-author of *Everyday Data Cultures* (2022).

**Amir Aryani** leads the Social Data Analytics (SoDA) Lab at Swinburne University of Technology. The lab applies data analytics techniques for insights into health and social challenges. His expertise is in data modelling, information retrieval techniques and real-time data analysis. Amir has partnered on projects with the British Library, ORCID (US), Netherlands Data Archiving and Network Analysis (DANS) and German Institution for the Social Sciences in Germany (GESIS). His funding sources include the Australian Research Council, the Australian National Health and Medical Research Council, and the US National Institutes of Health. He has published in journals including *Nature Scientific Data* and *Frontiers in Artificial Intelligence and Applications*.

**Jane Farmer** is Director of the Social Innovation Research Institute at Swinburne University of Technology, Melbourne, Australia. Her background is as a researcher in rural health service and workforce innovation, community engagement and social enterprise. She has a keen interest in academic-practice research partnerships, innovative research methods, transdisciplinary studies and translating research into practice. Her other books include *Social Enterprise, Health and Wellbeing* (2021), *Remote and Rural Dementia Care* (2020) and *Community Co-production* (2012).

**Anthony McCosker** is Deputy Director of the Social Innovation Research Institute and is a Chief Investigator and Swinburne Lead for the Australian Research Council Centre of Excellence for Automated Decision Making and Society. His research addresses digital inclusion, participation and inequality and explores the impact of new communication technologies, particularly in relation to health and wellbeing and social inclusion. Current research addresses the social issues related to automation and machine vision technologies and media and the need for community-led approaches to data and analytics. He is author or co-author of numerous articles and books, including *Everyday Data Cultures* (2022), *Automating Vision: The Social Impact of the New Camera Consciousness* (2020) and *Negotiating Digital Citizenship* (2016).



# **1**

## **Introduction**

In February 2020, just pre-COVID, a group of managers from community organisations met with us researchers about data for social good. "We want to collaborate with data," said one CEO. "We want to find the big community challenges, work together to fix them and monitor the change we make over ten years." The managers created a small, pooled fund and, through the 2020–2021 COVID lockdowns, used Zoom to workshop. Together we identified organisations' datasets, probed their strengths and weaknesses, and found ways to share and visualise data. There were early frustrations about what data was available, its 'granularity' and whether new insights about the community could be found, but about half-way through the project, there was a tipping point, and something changed. While still focused on discovery from visualisations comparing their data by suburb, the group started to talk about other benefits. Through drawing in staff from across their organisations, they saw how the work of departments could be integrated by using data, and they developed new confidence in using analytics techniques. Together, the organisations developed an understanding of each other's missions and services, while developing new relationships, trust and awareness of the possibilities of collaborating to address community needs. Managers completed the pilot having co-designed an interactive Community Resilience Dashboard, which enabled them to visualise their own organisations' data and open public data to reveal new landscapes about community financial wellbeing and social determinants of health. They agreed they also had so much more: a collective data-capable partnership, internally and across organisations, with new potential to achieve community social justice driven by data.

We use this story to signify how right now is a special, indeed critical, time for non-profit organisations and communities to build their capability to work with data. Certainly, in high-income countries, there is pressure on non-profits to operate like commercial businesses, prioritising efficiency and using data about their outputs and impacts to compete for funding. However, beyond the immediate operational horizon, non-profits can use data analytics techniques to drive community social justice and potentially impact on the institutional capability of the whole social welfare sector. Non-profits generate a lot of data, but innovating with technology is not a traditional competence, and it demands infrastructure investment and a specialist workforce. Given non-profits' meagre access to funding, this book examines how non-profits of different types and sizes can use data for social good and find a path to data capability. The aim is to inspire and give practical examples of how non-profits can make data useful. While there is an emerging range of novel data for social good cases around the world, the case studies featured in this book exemplify *our* research and developing thinking in experimental data projects with diverse non-profits that harnessed various types of data. We outline a way to gain data capability through collaborating internally across departments and with other external non-profits and skilled data analytics partners. We term this way of working *collaborative data action*.

By 'data for social good', we mean using contemporary data analytics techniques to fulfil a social mission or to address a social challenge. Data analytics is understood as the process of examining data to find patterns and insights that can aid decision-making and offer courses of action (Picciano, 2012). We define non-profits as all those organisations and community groups operating to pursue a social mission and that do not operate to make a profit. Individual non-profit organisations are thought of here as each pursuing their defined social mission, but also contributing to a collective social mission of achieving a more equitable and just society. While non-profits are often using data to track their operations and aid reporting, we emphasise the data that non-profits *could use* to further their work and goals. This includes mainly:


We take a pragmatic stance here as we write at a specific point in time and from our home country context (Australia), which we acknowledge is a high-income country with neoliberal ideology influencing social policy. Non-profit data analytics is a fast-moving field where practices and legislation will change. Other countries and regions have their own nuances. Globally, the non-profit sector is on a journey with data collection and computational data analytics. This is influenced by policy that drives competition and demand for accountability and measurement, as well as a desire to use sophisticated techniques for social good. This journey will continue into the future.

This moment feels like a critical juncture for non-profits and data analytics. Current strategies and decisions taken within the sector will significantly influence both the nature of non-profit data analytics and the philosophy underpinning it. Perhaps most crucially, they will influence who has the capability to work with data and to what ends, that is, towards what understanding of social benefit. We believe that non-profits need to have data capability to shape the future of the sector and affect the difference non-profits can make in the world. The sector can be knowledgeable, confident and advocate for suitable data practices, or, lacking capability, be forced to passively accept data practices determined by other powerful actors like government and 'Big Tech'.

This book is meant for non-profit leaders, managers, practitioners and board members who want to see what can be done with data and discover how organisations like theirs can become capable with data. It is also for researchers, as we show how partnering with non-profits can help us to contribute to social justice and to knowledge about data for social good. The book is deliberately targeted at the practice and researcher nexus.

This first chapter sets the scene by introducing concepts, challenges and our rationale for why non-profits should engage with data analytics. It is by no means comprehensive in its understanding of international data initiatives in the non-profit sector, especially not in relation to data law and guidance in different country contexts. For that, we recommend seeking out local expertise, as that area is subject to variation by country or region, and subject to change as practice is only forming.

## **The Non-Profit Sector and Data**

The non-profit sector comprises organisations with different legal and operational structures, including charities, philanthropic foundations, voluntary and community organisations, community groups, social enterprises and co-operatives (Salamon & Sokolowski, 2018). Some non-profits generate profit but re-invest it for social purpose. The sector has different names internationally, including the charitable and non-profit sector (Canada); third sector, social economy, voluntary sector (UK); third and social economy (Europe); not-for-profit sector, community sector (Australia and New Zealand); and charitable, voluntary and philanthropic organisations, civil society (US) (Lalande, 2018; Productivity Commission, 2010; Salamon & Sokolowski, 2018). Non-governmental organisations (NGOs) are non-profits that tend to work in other country contexts (Vaughan & Arsneault, 2013).

While non-profits generally operate to address social purposes not suitably addressed by government or private organisations (Vaughan & Arsneault, 2013), the social welfare role of non-profits can vary even within countries. Indigenous cultures including the Maori of Aotearoa (New Zealand), for example, have different understandings of social and community life that influence what is considered acceptable work for community organisations. Western notions of volunteering, separation of family and community, and who should provide community services should not be regarded as automatically aligned with Indigenous Peoples' cultural understandings (Tennant et al., 2006).

In high-income countries, non-profits are significant providers of community services, including health, mental health, social care, education, environmental protection and disaster relief programmes. They contribute significantly to national economies; for example, employing around 13% of Europe's workforce (Salamon & Sokolowski, 2018). Charities alone employ one in ten workers in Australia (Social Ventures Australia and Centre for Social Impact, 2021). Beyond service provision, non-profits contribute to generating a sense of community, "giving expression to a host of interests and values, whether religious, ethnic, social, cultural, racial, professional or gender-related" (Salamon & Sokolowski, 2018, p. 56) and, importantly, act as social policy advocates (Salamon, 2014). As such, non-profits are key actors in the policy community. They influence what are recognised as societal challenges, provide evidence about fruitful solutions and influence how the work of their sector is done (Vaughan & Arsneault, 2013). Government is a major funder for non-profits in high-income countries via contracts to provide welfare services (Salamon & Sokolowski, 2018). This increasingly leads to governments dictating the terms of engagement. Consequently, it is imperative that the non-profit sector is capable in contemporary organisational practices and innovations so it can influence social policy through data-supported knowledge and ideas.

In countries where policy is imbued with neoliberal ideology, including the UK, Australia and New Zealand, increased provision of public welfare services by non-profits started in the 1980s–1990s (Tennant et al., 2006). During this time, many traditional voluntary organisations became non-profit businesses. Additionally, the trend of non-profits supplying welfare services accelerated following the 2008 Global Financial Crisis. The marketisation of the non-profit sector led to competition for funding between organisations, forcing increasing corporatisation. Some now refer to a *not-for-profit industrial complex* (*Incite! Women of Color Against Violence*, 2017), with concerns non-profits are forced to subordinate their social mission to respond to funder-determined priorities in order to survive.

Accountability and reporting demands of government and philanthropic funders mean non-profits have had to collect increasing quantities of data. Funders influence or define the data to be collected and may even supply data collection systems. This scenario can stifle non-profits' internal strategies about working with data and funnel their work towards reporting rather than using data to drive social change. To date, the sector is accused of over-emphasising easy-to-collect output data (e.g., about number of services delivered) rather than data about outcomes, impacts and the processes underpinning them (Lalande & Cave, 2017). Over time, as non-profits look for new ways to gain competitive advantage, interest in innovative data use has grown. Some larger non-profits invest in data professionals, while others contract with specialist consultants.

The danger with outsourcing data-related work is that organisational data and analytics become viewed as 'too hard' and internal know-how diminishes. We propose non-profits need to have data capability so they can appropriately drive their organisations' data strategy for impact. More widely, collectively developing data capability at a sector level enables non-profits to influence government and funder priorities and investments around social challenges *and* data practices, informed by grassroots experiences. Here, we understand *non-profit data capability* as a holistic concept that involves interconnected combinations of resources. Data capability is hard to pin down to a checklist or benchmarking tool. It involves having the staff skills and roles, technologies, data management practices and processes that are appropriate for each non-profit in relation to its context of practice and that enable effective use of data within that context. Thus, data capability for a non-profit is likely to evolve, potentially in response to changing organisation priorities, learning from trying out techniques and datasets, and in response to emergent data practices and norms of the non-profit field. Non-profit data capability has foundations in responsible data governance. We suggest it can be built through collaborating, experimenting and discovering with data. We extend our discussion about non-profit data capability and how to achieve it in Chap. 3.

Unfortunately, because it relates to business operations rather than direct service provision, data and information management tends to be underfunded in non-profits (Social Ventures Australia and Centre for Social Impact, 2021; Tripp et al., 2020). Ongoing lack of investment and expertise in social data analytics leads to problems with adopting innovation, resulting in a phenomenon termed the *non-profit starvation cycle* (Gregory & Howard, 2009). This is where ongoing focus on funding service delivery leaves organisations under-invested not only in management and infrastructure, but also in staff skilled to understand what is required. Organisations are thus vulnerable to environmental shocks, as seen in reactions to the recent COVID-19 pandemic. A survey of Australian charities' capability to deal with the pandemic found only 46% used cloud-based systems and only a third had systems and software for working at home. Deficits were mainly attributed to underfunding (Social Ventures Australia and Centre for Social Impact, 2021). A survey and report by Australian technology non-profit Infoxchange shows that the sector has not yet prepared for advanced data analytics or for automated futures, although investment in information technology and digital infrastructure and systems is improving and the skilled workforce is expanding (Infoxchange, 2020).

Collaboration between non-profits would enable cost-sharing for infrastructure and skilled workforce, but competition in the sector is a barrier. This has led to suggestions that government should incentivise or facilitate collective working (Social Ventures Australia and Centre for Social Impact, 2021). Some successful collaborative models exist; for example, Collective Impact initiatives, where community organisations work together to identify, address and monitor change about a social challenge. LeChasseur (2016), for example, describes a *Collective Impact* initiative to improve lives of low-income mothers and their babies. In Collective Impact, collaborating with data facilitates measurement of community-level social change as well as helping to assess the contribution of individual organisations. Some non-profits are involved in initiatives funded by Social Impact Bonds, where private investment can be gained to fund projects to improve social outcomes, with outcome data required in order to access premiums (Arena et al., 2016; Sainty, 2019).

## **Making Good Use of Data**

The main goal of non-profits using data analytics is to inform organisational learning so adaptations can be made to achieve better outcomes. A range of reasons for applying analytics techniques to data to advance social missions are outlined by Verhulst and Young (2017), including for situational awareness and impact evaluation. Once attracted by the prospect of generating such analyses, the issue for non-profits might turn to how to adapt existing datasets, departments and staff into a system capable of generating insights from data.

Data analytics for non-profits is not solely predicated on having access to technology and applying computational techniques. Rather, it builds on having a foundation of knowledge about using data in research and evaluation. In this way, as *the science of examining data*, data analytics involves considering the characteristics of data you have or can access; its provenance and how it was collected; its availability for different uses and who can access it in unprocessed or analysed versions; understanding the ethical concerns, the consent given and obtained when data was created; the quality and what is missing in the data; and who data refers to or was collected from, to understand any in-built biases and data's inclusivity. However, as well as drawing on traditional research and evaluation knowledge, data analytics also requires evolving thinking and skills as new forms of data and analytical techniques become available and new ethical principles and practices are developed in response (O'Neil & Schutt, 2013). Ultimately, *good* use of data includes careful attention to how it is generated, the widening range of data types that can be analysed, and the impact this may have on people's privacy and other rights (see Chap. 3).

Exemplifying how using new types of data requires 'old' and 'new' thinking, we used a dataset of anonymised discussions on a national online peer support forum to evaluate services for rural mental health (Farmer et al., 2020; Kamstra et al., in press). Analysis was applied to identify themes in a large qualitative dataset of posts. Moving beyond traditional approaches to service evaluation, using the forum discussions as a rich qualitative dataset meant first agreeing on a rationale for the analysis conducted, and recognising the complexities inherent in the dataset as a sample. For instance, we had to address the potential for bias given that some people were over-represented in the data (i.e., posting far more often than others). With the focus on more isolated rural service users, we removed posts made by people living in large rural towns with hospitals to ensure only more isolated residents' experiences were included. The data allowed us to access the geospatial locations of those using the online service, but when mapping quantities and themes of posts geospatially, we had to consider how to visualise the data at sufficient spatial scale and abstraction to remove any potential for identification. Thus, while computational techniques now allow analysis of much larger datasets, and new sources extend potential for social value extraction from data, many of the same basic research skills are required to intelligently conduct and interpret data analyses. Making good use of data involves navigating new possibilities, while translating traditional research skills to respond to new challenges.
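For readers who work with code, the sampling decisions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the field names (`author`, `area`, `remoteness`, `theme`), the cap of three posts per author and the minimum cell size of five are all hypothetical values chosen for the example.

```python
from collections import Counter

# Hypothetical thresholds, chosen only for illustration.
MAX_POSTS_PER_AUTHOR = 3  # limit the influence of very frequent posters
MIN_CELL_SIZE = 5         # hide counts small enough to risk identification


def prepare_sample(posts, excluded_remoteness=("large_town",)):
    """Drop posts from excluded areas and cap the number kept per author."""
    kept = []
    per_author = Counter()
    for post in posts:
        # Keep only posts from the more isolated areas of interest.
        if post["remoteness"] in excluded_remoteness:
            continue
        # Skip further posts once an author has reached the cap.
        if per_author[post["author"]] >= MAX_POSTS_PER_AUTHOR:
            continue
        per_author[post["author"]] += 1
        kept.append(post)
    return kept


def theme_counts_by_area(posts, min_cell_size=MIN_CELL_SIZE):
    """Count posts per (area, theme); cells below the minimum become None."""
    counts = Counter((post["area"], post["theme"]) for post in posts)
    return {
        key: (n if n >= min_cell_size else None)
        for key, n in counts.items()
    }
```

Capping posts per author is only one way to handle over-representation (weighting is another), and suppressing small cells is only one way to reduce identification risk. Mapping at a coarser spatial scale, as described above, is an alternative that the `area` field could encode.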

Before progressing further, we now summarise the main types of external data sources and types of internal data content that we think non-profits might work with. Figure 1.1 illustrates characteristics of data we have used in our projects. It is not intended to be comprehensive of all data sources and content that could be used (for additional ideas, consult other relevant taxonomies, e.g., Susha et al., 2017).

We divide the data that non-profits might use into two categories: *internal data content* (i.e., the broad types of dataset content generated by non-profits through their work) and *external data sources* where data with a range of characteristics may be accessed. In Fig. 1.1, we suggest non-profits' internal data content can be divided into two types: *operational data*, where data is generated for and through an existing business purpose, including data about staffing, clients, services and funds; and what we term *outcome data*, referring to data collected specifically for assessing processes, outcomes or impacts of programmes. For the outcome data, what to collect is likely to be informed by a theory of change or programme logic showing links between non-profits' programmes, how they are delivered, what they achieve and the ultimate fulfilment of social mission. Typically, outcome data might be collected through surveys at intervals following provision of programmes. *External data sources* include all data that can be accessed external to the organisation and used, including open data generated by government statistical agencies and data made into open data by other organisations. An example we have used is the *Infoxchange AskIzzy Open Data Platform* (https://opendata.askizzy.org.au/), which provides anonymised geospatial location-based data from searches for community services across Australia.

**Fig. 1.1** Taxonomy of data that non-profits might use

External data also includes government and other organisations' data that can be made available under certain conditions and for particular purposes. Such data may be accessible subject to risk assessment or research protocol (e.g., sensitive government-collected health or crime data). Our understanding of 'other organisations' data extends to data from *other* non-profits, private sector organisations, academia and community groups. External data could include internal (private) datasets where data is only available to be shared within a limited collaborative group. This data will be available to the group under specific conditions through data sharing agreements as part of data sharing initiatives.

Data may be *quantitative*, for example, amount of time spent with clients, numbers of episodes of types of services delivered, distances travelled to deliver services and financial information; or *qualitative*, for example, discursive content of notes relating to clients, complaints and feedback, online forum post data. To be meaningful and relevant, analysis should also harness data that is *temporal*, for example, data capturing client needs and transactions on a daily or weekly basis over time and other forms of monitoring to enable longitudinal and even 'real time' analysis; and data that is *locational*, for example, giving a geospatial location of where services were provided or locations of clients and staff (Loukissas, 2019).

Having summarised types of data that a non-profit might use, a further issue is how they might think about sources of internal data for use in analytics. Through our work, we observe two approaches to sourcing internal data that we term here the *new data* and *re-use data* perspectives. The *new data perspective* tends to align with growth of the outcomes measurement movement (Lalande & Cave, 2017; Social Ventures Australia, 2021), where non-profits want to substantiate their social impact. This is generally handled by collecting *new data* about outcomes, impacts and processes. Where organisations initially tended to generate data through bespoke programme evaluations, more recently there is a trend to collect generic outcomes data using frameworks and data models. Using standard tools means non-profits can save effort in generating their own indicators and measures, plus a standard framework allows comparison and benchmarking across different organisations. Theoretically, funders will be able to discover which non-profits most successfully address a social challenge such as social inclusion, employment or crime prevention. Examples of such frameworks are generated by governments (e.g., the New South Wales Government Human Services Outcomes Framework, see https://www.facs.nsw.gov.au/resources/human-services-outcomes-framework) and businesses or social enterprises (e.g., Australian Social Values Bank). Researchers have also developed frameworks, for example, the *Community Services Outcomes Tree* (Wilson et al., 2021) was designed to provide "a comprehensive outcomes framework to assist services to name and then measure their outcomes…[and]… a set of data collection questions so services can ask questions of service users and collect data" (p. 1).

While such frameworks might assist cash-strapped non-profits, they have potential downsides. They imply collecting yet more data and are potentially inflexible to the nuanced interests and missions of individual non-profits. Adhering to them could drive isomorphism, where programmes tend to become increasingly alike because they are all addressing a standard set of performance measures. This could hinder innovation and lead to neglecting nuanced needs of different clients and consumers. Piff (2021) highlights that non-profits could waste valuable time trying to find the perfect framework and re-orienting their data collection to meet its new requirements.

Advocacy for the *data re-use perspective* comes from policy institutes, researchers and others that are interested in combining digital social innovation with growing community and civil society data capability (Dawson McGuinness & Schank, 2021). Analysing re-used data is something of a frontier space where data scientists may partner with social scientists, lawyers, community practitioners and citizens to formulate practices that are ethical and obtain added social value from data already collected (Williams, 2020). New rules, standards, models and tools are often emergent from practical data analytics 'discovery' projects and collaborations (van Zoonen, 2020). An example of generating novel transferable tools comes from our projects with non-profits (see Chap. 2) where data protocols and data-sharing agreements were formulated through iterative discussions with data scientists, practitioners at non-profits and lawyers, where necessary.

Ultimately, of course, data must have been collected in order for it to be re-used and so the new data perspective also could generate data with potential for added value from re-use. Sometimes there may be a need for new data, but given a lot of data is already collected and exists, we advocate for optimising data re-use (where ethical and feasible) and minimising collection of new data.

As mentioned above, non-profits might work with others (non-profits and other entities) and share or pool data for richer insights and to drive collective working. Sharing data in multi-organisation collaborations is notoriously challenging (Verhulst, 2021). Understanding the extent to which data can be re-used and for what purposes, including sharing across collaborations, involves knowing why and how data was collected originally and, crucially, the details of consent obtained from those contributing to data generation (Verhulst, 2021). In the case of non-profits with their propensity to collect personal data, it often involves knowing about the nature of consent from clients, citizens and staff. Issues around consent for re-using and sharing data are explored in Chap. 3.

## **Starting to Think About Data Capability**

Moving non-profit data analytics out of an environment of research projects and experimental initiatives and into business as usual requires comfort with using data and understanding the roles of data across the organisation and beyond. As noted above, data capability can be understood as a holistic concept, and we explore this in more detail in Chap. 3. Building data capability is not just about buying software or employing data professionals. Rather, it involves deepening knowledge and expertise in connecting the goals and work of a non-profit (their mission) with resources enabling appropriate *use of data* to meet the goals. This includes proficiency about what, where, why and how data is significant and why and how to use different data analysis techniques (Tripp et al., 2020).

It takes effort and commitment to grow organisational data capability, and there is a temptation to turn to commercial platforms and tools, like Amazon Web Services or Microsoft Azure, for data management and analysis. The challenge with implementing such tools without an organisation having done the groundwork to gain data capability is that they apply advanced analysis techniques without transparency. An organisation that invests internal know-how into identifying and implementing tools and practices that match its needs will understand potential for bias and other data harms. While we do not explore use of artificial intelligence (AI) in this book, it is coming and indeed already present in some non-profit operations and social service work. Non-profits that build their data capability will be resourced with knowledge to understand this application of data and to advocate and advise on ethical and wise use of advanced techniques.

In the context of non-profits' data work, we favour using the term *data capability*. *Data literacy* and *data maturity* are other terms applied to try to capture the idea of being 'ready' for using data. The need for citizen 'big data literacy' is widely discussed (e.g., Grzymek & Puntschuh, 2019; Müller-Peters, 2020) in the context of 'data citizenship' (Carmi et al., 2020) as a response to expanded datafication and algorithmic decision making. Sander (2020), for example, suggests this "goes beyond the skills of… changing one's social media settings, and rather constitutes …[being]… able to critically reflect upon big data collection practices, data uses and the possible risks and implications that come with these practices, as well as being capable of implementing this knowledge for a more empowered internet usage" (p. 2). One problem with using the term 'data literacy' in the context of non-profits is that it tends to target the competencies and critical awareness of *individuals* (D'Ignazio & Bhargava, 2015; Frank et al., 2016) and thus seems less suited to considering organisation-level attributes.

Similarly, we are not enthusiastic about the term 'data maturity', even though it suggests organisation-level qualities, because it conjures up the notion of an ultimate 'finish line' and doesn't account for the wide variety of circumstances that shape the use of data. We opt to talk about data capability because what we envisage are plural and dynamic qualities, situated historically and culturally, that are fundamental to fostering change across new socio-technological milieux. While ideas of data literacy and maturity help by compiling skill and competency needs, our approach is to democratise data practices, open up data expertise to all parts of an organisation and push it beyond the IT department or the bounds of appointing specialist data professionals. Our holistic conceptualisation of data capability resonates with Williams' (2020) depiction of 'data action' for public good, which is described as "a methodology, a call to action that asks us to rethink our methods of using data to improve or change policy" (p. xiii). Aligned with this call-to-action approach is a widening of data accountability, responsibility and ethics. In short, data capability involves more than ticking off attributes from a list; it is about evolving understanding, resourcing, implementing and doing, involving people across organisations and in relevant communities, and interacting with changing contexts and missions.

In this book, we provide examples of how non-profits can use data and give practical strategies for non-profits to build data capability. The central approach we offer for building new capability is *collaborative data action*. Rather than consigning data solutions to individual projects or teams, we encourage collaborative processes within and across organisations. In Chap. 2, we give case studies of using collaborative data action with non-profits to generate new insights from using and re-using data. In Chap. 3, we delineate the collaborative data action methodology and highlight why it is particularly useful for non-profits. Based on our research with non-profits, we distil key issues for non-profits to prioritise. Our mission is to put data analytics within the reach of all non-profits and to overcome isolationist and competitive data practices that concentrate capability with the well-resourced (large) few. That is, we aim to avoid replicating the logic of private enterprise, commercialism in data use and start-up culture exceptionalism.

Part of the 'magic' of collaborative data action is bringing together different knowledges, skills and experiences because data analytics for non-profits is a hybrid activity (Verhulst, 2021; Williams, 2020). It requires the skills of data scientists, but they tend to lack social science training. It requires social scientists with grounding in evidence and methods of social fields, and it needs practitioners because they know the practices and operating contexts of non-profit work. As non-profits' capability is built, their data work increasingly must incorporate the voice and perspectives of clients, citizens and communities. To achieve this, it is necessary to navigate the problematic environment that has arisen due to some of the ways that social data analytics has been applied to date; that is, to address the (ab)use of data causing *social harm*.

## **Navigating Data Harms by Involving Citizens**

Part of the rationale for growing non-profits' data capability is to bridge the gap between the desire to extract optimal social value from data and the need to address the risks from (re-)using this data. Much of the data non-profits generate and work with is likely to be personal data about clients and customers, perhaps sensitive and health-related data. Accountability to clients, customers and communities around use and re-use of data is paramount and challenging to execute well. At this point, as good data safety practices and technology are available, challenges are mainly due to a lack of established, evaluated models of good practice for working with people to formulate governance principles and processes for re-using data about them and, building on this, for engaging citizens as empowered partners in data projects that use their data.

Constructing sound practices for using and re-using citizen data requires citizens at the table. In our experience of data projects with non-profits, they find it challenging even to think about holding discussions with clients and consumers about how to develop such practices. They appear afraid to mention 'the d word'. This fear of engaging with clients and consumers regarding data is linked largely to perceptions of risk due to high-profile accounts of social data *misuse*. Critical accounts of datafication emphasise the way data has become a social and political issue "not only because it concerns anyone who is connected to the Internet but also because it reconfigures relationships between states, subjects, and citizens" (Bigo et al., 2019, p. 3). Accounts about the impact of datafication on society are multiple and sometimes depict grave consequences. They exemplify harms from use of data analytics in replicating and driving inequalities of race and ethnicity, gender and class, and concentrating power in the globally dominant technology corporations (e.g., Criado-Perez, 2019; Eubanks, 2018; Noble, 2018; O'Neil, 2016; Srnicek, 2016). High-profile failures to use data and technology in social welfare settings, for example in Australia, the notorious failed Federal Government 'Robodebt' automated debt recovery programme based on welfare services data (Henriques-Gomes, 2020), are mirrored internationally. Such cases have eroded public confidence in institutions that would traditionally be trusted to care for and about citizens and data.

Different countries and regions are beginning to clarify data rights and heighten the accountable, responsible production and use of personal and social data through high-level legislation, such as the European Union's General Data Protection Regulation (GDPR) (European Parliament and the Council of the European Union, 2016), and proposed Bills to regulate Artificial Intelligence (AI). However, there is still ongoing uncertainty about what rules pertain in different contexts, and even how to find out. Data security and privacy law and responsible data governance are core elements of the context of non-profit data analytics, but we also note that risk aversion around working with data can be the immediate, and apparently easiest, response. Among non-profits highly sensitive to social injustice, vulnerabilities and systemic inequality, the idea of doing more with client and citizen data can be met with considerable anxiety, resulting in waiting until things get clearer (i.e., not re-using data). We suggest a key reason why non-profits should grow their data capability is so they can confidently and competently engage with clients, citizens and communities around responsible data use. While there are risks, and a need to proceed with caution, using citizen data for insights could bring benefits to clients, customers and the wider community. Data is already generated, so it is responsible re-use that is the central issue to be resolved. There are, arguably, three key issues to be considered in non-profits working with citizens and data: (1) developing sound data governance practices, (2) working with citizens to gain insights from data and (3) raising citizen data literacy and community data capability.

Some researchers have begun to explore how to involve 'lay' participants in discussions around responsible use of data. For example, the Data Justice Lab (Warne et al., 2021) produced a civic participation guidebook outlining participatory methods including citizens' juries and mini-publics (deliberative conversations) to discuss data use. Living labs and hackathons are other methods discussed (e.g., Flowing Data, 2013). These methods, though, tend to engage citizens in discussing large administrative or government datasets, rather than making direct links between citizens and re-use of data about them. There are some cases of active engagement of citizens with deciding about uses of their own data; for example, the Salus health data co-op in Barcelona involves people making decisions about selective use of their data (e.g., for health research), as opposed to making it entirely open or private and unavailable for re-use (Calzada, 2021). Open Humans (https://www.openhumans.org) is a non-profit dedicated to supporting individuals and communities to explore use of their data for social purposes. We found a few examples of engaging more marginalised groups about their data, and these are the citizens with which non-profits are most likely to work.

Here, perhaps, work on *Indigenous data sovereignty* indicates a useful way ahead (Kukutai & Taylor, 2016). Data sovereignty is a way of understanding the importance of establishing consent and respecting the rights of, and ensuring benefits for, those who are the subjects of data (Carroll et al., 2020). In many parts of the world, Indigenous data sovereignty working groups and scholars are defining and addressing data inequalities and exploitation among those who have had least control and benefit over data collected about them. Carroll et al. (2020) discuss the process and rationale for developing the *CARE Principles for Indigenous Data Governance*. CARE stands for: Collective benefit, Authority to control, Responsibility, Ethics; and the principles are intended as a guide for stewardship and processes to enable self-determining citizens to make decisions relating to collection, storage, analysis, use and re-use of data. The CARE principles were developed by Indigenous people due to widespread abuse of data about them involving issues of over-surveillance, use of data for policing, lack of transparency and control, and under-counting (thus under-representation). Data is as important to the sovereignty of a people as language, artefacts, landmarks, beliefs and cultural knowledge, and natural resources. As Tahu Kukutai and John Taylor eloquently argue: "missing from those conversations have been the inherent and inalienable rights and interests of indigenous peoples relating to the collection, ownership and application of data about their people, lifeways and territories" (Kukutai & Taylor, 2016, p. 2). Indigenous ways of knowing can offer new models for data governance that are built on collaborative, rather than individual or proprietary responsibility, and more respectful forms of consent. Work on Indigenous data sovereignty can offer principles for wider application to engage with citizens represented in data and who have experienced power inequities.

Moving beyond citizen engagement in designing data governance, clients and consumers should be engaged where non-profits re-use data about them. This could involve data analyses relating to, for example, situational awareness, impact assessment or community insights. This goes beyond acknowledging people's representation in the data to acknowledging their vital 'lived experience' roles in ground-truthing and interpreting 'what is going on' in data analyses. Most contemporary non-profits have established relationships and ways of engaging lived-experience clients and customers in informing and enabling services, so engagement with data analytics would represent an extension of such work. Partnering with citizens about data is important for informing the work of non-profits, and, as such, should be appropriately recompensed. This acknowledges the expertise of citizen clients and customers as key stakeholders in use, visualisation and interpretation of data *that is about them.* As Williams (2020) notes, involving citizens is integral because "data are people" (p. 220).

Some excellent examples of resources for involving citizens with lived experience in data projects have been generated in recent years through work of Elsa Falkenburger, Kathryn Pettit and others at the Urban Institute and specifically its National Neighborhood Indicators Partnership (NNIP; https://www.neighborhoodindicators.org/). These community data advocates devised a 'data walk' methodology to engage citizens with analysed and visualised datasets to help make decisions about their communities (Murray et al., 2015). More recently, a short *Guide to Data Chats* resource has been produced for practitioners, giving really practical advice and tools for involving citizens with data (Cohen et al., 2022). As part of the NNIP's projects, citizens are often trained to collect new, granular 'citizen science' data about aspects of living in the locale.

The NNIP sees building community data capability as a key outcome of engaging citizens in data projects. Through their engagement with clients and customers, non-profits could be significant in developing citizen and community understanding around ethical data collection and use. As digital inclusion becomes central to social equity agendas, non-profits' data work with clients, customers and citizens could move beyond service delivery and contribute to a wider social mission of building client data literacy. This could be done by engaging people with their data, discussing issues such as sovereignty and potential to re-use data and generating co-designed data governance. Such activities would centre clients and consumers in non-profits' data practices and contribute to building data capability at community level.

Initiatives around the world are working to provide examples of ways to engage citizens. For example, Our Data Bodies (https://odbproject.org) is a project working with low-income people in the US on data rights, and Amnesty International is engaging with data volunteers to help organise crowd-sourced datasets (Acton, 2020). However, specifically considering the range of large and small non-profit organisations, our experience of current practice is that non-profits' engagement of consumers and clients in re-use of their data does seem to present quite a leap. Most non-profits we have worked with are still at the stage of building their own internal data capability. As Sander (2020) concludes with regard to citizen engagement, there is, as yet, "too little knowledge on what kind of literacy efforts work best and a lack of constructive or comprehensive research on how to address people's lack of knowledge" (p. 1). We argue that non-profits' management, boards and staff require their own data knowledge, awareness and experience as a precursor to engaging clients appropriately in conversations about data and involvement in co-design of data use practices. This is not ideal but realistic based on our experiences. Until this time, it is imperative that non-profits understand the consent they have to gather, using this knowledge to work within general ethical parameters (Williams, 2020).

## **Key Takeaways from This Chapter**

In this chapter, we set the scene and introduce some key ideas about why and how non-profits need to engage with data analytics. The key points we'd like readers to take away are listed below.

#### Key Takeaways


In the next chapter, we present case studies that illustrate our journey of working with non-profits and data, from an earlier example of working largely with social media data and government consultation submissions, to working with non-profits exploring their own data, to generating a data collaborative with non-profits and other organisations taking a place-based approach (Chap. 2). We present the case studies to give a picture of the different kinds of data projects we are talking about in this book, but also because it was working on these projects that led to the understanding of data capability we suggest here and our appreciation of the benefits of working collaboratively. In Chap. 3, we build out from the learnings of the case studies. We more fully describe what data capability for non-profits looks like and outline the collaborative data action methodology that we generated and refined while working on the case study projects and reflecting on similar work elsewhere. In Chap. 4, we look to the future, discussing the way ahead for non-profits and data analytics for social good and suggesting research and practice priorities. Data practices and regulation are dynamic and rapidly changing, so there will be new work that constantly refreshes and extends what we say here. Our focus in this book is on what we gleaned from very practical projects with practitioner partners. We note the book does not provide a comprehensive international scoping of all uses of data for social good or initiatives. Rather, here we tend to highlight the initiatives and resources that we have drawn on most in developing our work (see appendix for specific detail of these). We hope this book gives help and inspiration to non-profits seeking data analytics for social good and researchers working alongside them.

## **References**

Acton, D. (2020). *Designing Amnesty Decoders: How we design data-driven research projects.* Amnesty International: Citizen Evidence Lab. Retrieved July 15, 2022, from https://citizenevidence.org/2020/10/09/designing-amnestydecoders-how-we-design-data-driven-research-projects/


A. Zimmer (Eds.), *The third sector as a renewable resource for Europe* (pp. 49–93). Palgrave Macmillan.


Warne, H., Dencik, L., & Hintz, A. (2021). Advancing civic participation in algorithmic decision-making: A guidebook for the public sector. *Data Justice Lab*. Retrieved April 7, 2022, from https://orca.cardiff.ac.uk/143384/

Williams, S. (2020). *Data action: Using data for public good*. MIT Press.

Wilson, E., Campain, R., & Brown, C. D. (2021). *The community services outcomes tree. An introduction*. Centre for Social Impact, Swinburne University of Technology. https://doi.org/10.25916/7e8f-dm74


# **2**

## **Case Studies of Data Projects**

Chapter 1 made the case for non-profits building their data capability as part of enabling their work for social good. This chapter jumps straight into the reality of how organisations start to work with different types of datasets and learn about working with data. We present three case studies of our own research working with different non-profit (and other) organisations and different internal re-used datasets, as well as open public datasets. Each case study features collaborative data action and, we argue, results in steps towards data capability. We jump straight to the projects here because this is really what happened in our work. We took our skillsets from our different research backgrounds (approximately data science, communications and community development) and looked at how we could partner with organisations to address their real challenges. As well as having a problem to solve, each partner organisation we worked with also had a curiosity to find out about whether data science could help. In our first case study, we worked with government departments and agencies to understand the public conversation on family violence and the impact of policy. For the second, we partnered with three non-profits looking to solve social problems with data. Our final case study is a collaboration with several community organisations and a bank in a regional city. The case studies illustrate the evolution of our work with data over 2017–2021, and how we came to arrive at collaborative data action as a methodology as it was trialled and refined over a series of studies. There are hints about what building data capability involves in each case study, but we only started to build in processes of evaluation as our studies progressed. Hence, the case studies have slightly different formats. And only over this evolution of cases and other data projects have we arrived at our understanding of data capability. This is explored in Chap. 3.

We suggest the case studies show how data projects that involve social mission-driven organisations benefit from combining multiple skills and perspectives. This is because applying data science in domains of social action is complex. It benefits from knowledge of relevant evidence, acknowledging that ideology and values are always present, and above all it benefits from practitioner expertise through their experience working in contexts that highlight what is significant and how to address it. Our case studies are light-on regarding the techniques of 'big data' science because this is not a book on how to do data analytics technically. That is covered in other texts (e.g., Aragon et al., 2022). In this chapter, we focus more on *what we did* from an operational, indeed co-operational, standpoint. We expand on what that means (the implications and how to build data capability) in Chaps. 3 and 4. Case study projects 2 and 3 took place in 2020–2021, during the COVID-19 pandemic, when extended lockdowns meant a lack of face-to-face engagement. The case studies are as follows:

The project featured in Case Study 1 involved re-using data for insights into the public conversation about family violence following implementation of new state family violence policy. Working mainly with a government department concerned with family violence policy, but also in consultations with non-profit stakeholders, the case study addresses how to gain information about social outcomes by re-using qualitative datasets generated via social media and public consultation. It thus exemplifies some of the kinds of datasets, analyses and visualisations that non-profits could use when looking for novel data to inform outcomes evaluation.

The project in Case Study 2 involved working with three non-profits of different sizes. They partnered to learn if and how they could use internal already-generated data to create added value, particularly to show their organisations' direct and wider social impacts and to improve organisational effectiveness.

Case Study 3 illustrates how seven organisations, including non-profits and a bank, worked together to find out if and how they could use their internal data, plus open data, to find out more about their community. They brought data together to generate geospatially visualised data layers describing community resilience, including layers about social connection, financial wellbeing, homelessness and housing, and demand for social services. The case highlights some of the potential and challenges in sharing data amongst organisations.

Table 2.1 summarises the case studies including an overview of the topic and nature of the collaboration, datasets used, analyses and visualisations and key learnings.

At the end of this chapter, we compare some aspects across the cases, mainly considering what was learned as this informs the themes about building capability and collaboration that are extended in Chap. 3.

## **Case Study 1: Outcomes of Family Violence Policy—A Public Sector Collaboration**

#### **Project Goal**

Explore the value of novel datasets to inform the State Government of Victoria, Australia, about changes to the public conversation after it introduced new policies to address family violence.

#### **Project Description**

The Victorian Government produced new family violence prevention policies in 2017 in response to a Royal Commission investigation (2015–2016). Alongside recommendations for public and community sector reform, the government produced a framework of outcome indicators. These tended to reflect aspirations for change and were considered difficult to measure, particularly those related to improved awareness, understanding and attitudes about family violence in the community. Some of the outcomes were complicated to assess; for example, while the policy sought a "reduction in all family violence behaviours" (State Government of Victoria, n.d., p. 6), family violence incident reporting rose, possibly because people were more comfortable with coming forward and were supported to do so with better services. Simply measuring changes in crime statistics, therefore, gave potentially misleading results.

**Table 2.1** Summary data projects case comparison

We worked with government and government agency partners to target outcomes relating to changes in public discussion. We assessed changes by analysing: (a) the public consultation submissions that informed the new policy (to establish a baseline of core family violence issues) collected in 2015 and (b) public discussion through social media data (Twitter) and news media reporting to understand how the public conversation changed in response to public policy during 2014–2018.

#### **Collaborating Partners**

The project was instigated by the Victorian Department of Premier and Cabinet (DPC). The DPC leads whole-of-government policy and performance for the state of Victoria, coordinating activities to help the government achieve its strategic objectives.

Other partners that collaborated on this project were:


#### **How the Project Began**

The project started with discussions with the DPC in mid-2018 about the feasibility of re-using external data sources to inform outcomes. This was an exploratory project and, as a first step, our DPC partners spent several months identifying a suitable topic and group of stakeholders. Criteria for selection were as follows: the topic area should be non-controversial; there should be pre-existing good relationships between relevant agencies and departments; and stakeholders should be open to novel data analytics. The DPC had its own Business Insights Unit that analysed data, so these staff were involved with the aim of complementing, not replicating, the work they were already doing. Initial workshops were held involving our multi-disciplinary university researcher team and partner staff, and this led to identifying data sources and likely useful types of analysis.

#### **Summary of Datasets Used**

Data sources (see Table 2.2) were selected to provide insights into public discussions about family violence over the five-year study period, allowing comparisons year by year.


**Table 2.2** Data sources for public discussion of family violence

#### **Methods**

**Discussion Workshops** A steering group with representatives of project partners met six times during the project. Early workshops established questions to pursue in the data analysis and identified a timeline of policy events from 2014. As data was analysed and explored through subsequent workshops, the group gave feedback on findings and input to aid further analysis. Through these workshops, a *collaborative analysis* strategy was developed.

**Data Analysis** Data analysis techniques were chosen to fit datasets and project goals. To discover semantic patterns within the large bodies of text data from the three datasets, natural language processing (NLP) was used to augment qualitative content and thematic analysis. This involved word frequency and clustering analysis, using the Pearson correlation coefficient (Pearson's *r*), and the topic modelling method Latent Dirichlet Allocation (LDA). The approach to analysis is informed by established theory in policy analysis, frame analysis and socio-linguistics that addresses the formation of public social issues and understands the role of language and communication in 'framing' or shaping and contesting the parameters of those issues.
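To make the word frequency and correlation step more concrete, the following is a minimal sketch of computing Pearson's *r* between the word-frequency profiles of two texts. It is an illustration only, not the pipeline used in the project: the tokenisation rule and the function names `word_counts`, `pearson_r` and `text_similarity` are assumptions introduced here.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count alphabetic tokens (assumed tokenisation)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson's r is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def text_similarity(text_a, text_b):
    """Correlate the word frequencies of two texts over their shared vocabulary."""
    counts_a, counts_b = word_counts(text_a), word_counts(text_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    return pearson_r([counts_a[w] for w in vocab], [counts_b[w] for w in vocab])
```

Texts whose word-frequency profiles correlate strongly could then be grouped together, which is one simple way that a correlation measure can feed a clustering step.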

A timeline analysis of the Twitter dataset identified peaks in discussion across the five-year timeframe and matched these with known policy or public events. Named entity recognition was also used to identify key individuals and organisations and their prominence at different times.
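One simple way to picture the timeline analysis is sketched below: weeks whose tweet count is unusually high are flagged as peaks and then paired with dated events that fall close to them. The threshold rule (mean plus 1.5 standard deviations), the seven-day matching window, the function names and all of the data are assumptions made for illustration; the source does not specify how peaks were defined.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def find_peaks(weekly_counts, threshold_sd=1.5):
    """Return week-start dates whose count exceeds mean + threshold_sd * SD."""
    values = list(weekly_counts.values())
    cutoff = mean(values) + threshold_sd * pstdev(values)
    return [week for week in sorted(weekly_counts) if weekly_counts[week] > cutoff]


def match_events(peaks, events, window_days=7):
    """Pair each peak week with the events dated within window_days of its start."""
    window = timedelta(days=window_days)
    return {
        week: [name for name, day in events.items() if abs(day - week) <= window]
        for week in peaks
    }


# Hypothetical weekly tweet totals and a hypothetical event, for illustration only.
start = date(2016, 3, 7)
weekly_counts = {
    start + timedelta(weeks=i): n for i, n in enumerate([100, 110, 95, 600, 105])
}
events = {"hypothetical policy announcement": start + timedelta(weeks=3, days=2)}

peaks = find_peaks(weekly_counts)
matches = match_events(peaks, events)
```

In a collaborative setting, the list of candidate events would come from partners' knowledge of the policy timeline rather than from the data itself.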

**Submissions to the Royal Commission Public Inquiry (2015)** The sample of public submissions was analysed using word frequency and thematic clustering, as well as qualitative content analysis to establish a baseline of the key policy dimensions framing family violence. The submissions were taken as a proxy for the attitudes and topics discussed by an *informed public*, that is, the diverse individuals, community sector and services, government and research voices, who have experiences of family violence or work with victim survivors or perpetrators.

**Twitter Corpus (January 2014–December 2018)** To identify topics in the Twitter dataset over the target timeframe, a sampling strategy was used, generating a maximum of 500 tweets per week. To inform the timeline analysis, this sample was supplemented by querying the *Twitter counts endpoint*, which returns the total tweet count at each timepoint. This allows quantification of tweets beyond the 500 per week sample.
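The weekly cap can be pictured with the short sketch below, which groups tweets by ISO week and keeps at most `cap` per week. The chapter does not say how tweets within a week were chosen, so the uniform random draw, the fixed seed, the record layout and the function name `sample_per_week` are all assumptions.

```python
import random
from collections import defaultdict
from datetime import date


def sample_per_week(tweets, cap=500, seed=0):
    """Keep at most `cap` tweets per ISO week, drawn uniformly at random (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_week = defaultdict(list)
    for tweet in tweets:
        iso = tweet["created"].isocalendar()
        by_week[(iso[0], iso[1])].append(tweet)  # key: (ISO year, ISO week number)
    sample = []
    for week in sorted(by_week):
        bucket = by_week[week]
        sample.extend(bucket if len(bucket) <= cap else rng.sample(bucket, cap))
    return sample


# Made-up records: three tweets in one week, one in the following week.
tweets = [
    {"created": date(2016, 3, 7), "text": "example a"},
    {"created": date(2016, 3, 8), "text": "example b"},
    {"created": date(2016, 3, 9), "text": "example c"},
    {"created": date(2016, 3, 15), "text": "example d"},
]
print(len(sample_per_week(tweets, cap=2)))
```

Because the cap hides how busy a week really was, the separate per-period totals remain necessary for any volume comparison.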

LDA topic modelling was applied to Twitter posts for each year. Since LDA is an unsupervised learning model, there is no ground-truth on the number of topics, and therefore it is the researcher's responsibility to validate the appropriate number of topic clusters. For our study, the number of topics identified for each year is established by model parameter checks. The topic modelling process established a range of topic options, and these were reviewed by the researchers on the team to identify the most coherent and distinct topics, with the number of topics varying each year.

**News Media Corpus (January 2014–December 2018)** The meta-data captured via the API for each article included the source name (media outlet), time and date of the article. We cleaned the media dataset by scraping the body of the articles from provided links. Stories with invalid URL links and duplicate stories published in more than one outlet were removed, retaining the first published article. LDA topic modelling was applied to the news media corpus, and a hand-annotated topic descriptor was associated with each cluster.
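The cleaning rules described above (drop stories with unusable links, keep only the first published copy of a duplicated story) can be sketched as follows. How the project detected duplicates and invalid links is not stated, so this sketch assumes that a duplicate is a story with an identical body after lower-casing and whitespace normalisation, and that a valid link is one starting with `http://` or `https://`. The field names are hypothetical.

```python
from datetime import datetime


def clean_articles(articles):
    """Drop articles without a usable URL and keep the earliest copy of each story."""
    earliest = {}
    for art in articles:
        url = art.get("url") or ""
        if not url.startswith(("http://", "https://")):
            continue  # assumed definition of an invalid link
        # Assumed duplicate key: body text, lower-cased with whitespace collapsed.
        key = " ".join(art["body"].lower().split())
        kept = earliest.get(key)
        if kept is None or art["published"] < kept["published"]:
            earliest[key] = art
    return sorted(earliest.values(), key=lambda a: a["published"])


# Made-up records for illustration only.
articles = [
    {"outlet": "Outlet A", "url": "https://example.org/a", "body": "Story  one",
     "published": datetime(2016, 1, 2, 9, 0)},
    {"outlet": "Outlet B", "url": "https://example.org/b", "body": "story one",
     "published": datetime(2016, 1, 1, 18, 0)},
    {"outlet": "Outlet C", "url": "", "body": "Story two",
     "published": datetime(2016, 1, 3, 8, 0)},
]
for art in clean_articles(articles):
    print(art["outlet"], art["published"].isoformat())
```

Syndicated stories are often lightly edited between outlets, so exact matching will miss some duplicates; a fuzzy comparison would be a natural extension.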

With all the datasets, reliability of machine analysis was checked by manual qualitative coding of samples of data items (tweets, stories and public submissions) and inter-coder reliability checks involving four people independently coding samples. The team checked emergent topics against the outcomes framework we were seeking to inform, existing research evidence and the Royal Commission reports.
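The chapter does not name the agreement statistic used for the inter-coder checks. Fleiss' kappa is one common choice when more than two coders label the same items, and the sketch below computes it for four coders. The category labels and ratings are made up, and the function assumes every item was rated by the same number of coders.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa; each row maps a category to the number of coders who chose it."""
    n_items = len(ratings)
    n_raters = sum(ratings[0].values())  # assumes every item has the same number of coders
    categories = sorted({c for row in ratings for c in row})

    # Observed agreement: share of agreeing coder pairs per item, averaged over items.
    per_item = []
    for row in ratings:
        agreeing_pairs = sum(row.get(c, 0) * (row.get(c, 0) - 1) for c in categories)
        per_item.append(agreeing_pairs / (n_raters * (n_raters - 1)))
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall share of assignments in each category.
    shares = [sum(row.get(c, 0) for row in ratings) / (n_items * n_raters)
              for c in categories]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1:
        return 1.0  # every coder used a single category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up codes from four coders for three items.
ratings = [
    {"victims": 4},
    {"systems": 3, "solutions": 1},
    {"perpetrators": 2, "victims": 2},
]
print(round(fleiss_kappa(ratings), 3))
```

A low score on a sample is a prompt to revisit the coding guide or the topic labels before trusting the machine-generated clusters.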

## **Findings**

We reported a range of findings that helped identify the longer-term changes in the way family violence was discussed and were able to estimate the main effects of the Royal Commission and subsequent policy initiatives. These changes, observable through the different public discourse datasets (news, Twitter, public inquiry submissions), were mapped against the government's official outcome indicators. A number of diagrams and chart types were chosen to present the most salient findings. These choices matter, and working with large corpus natural language or text datasets meant that innovative techniques had to be used to convey findings concisely and dynamically.

A *tree diagram* was used to visualise five core thematic dimensions of family violence identified through analysis of the Royal Commission public submissions and policy reports, which were victims, perpetrators, causes and contexts, systems, and solutions. These dimensions served as a baseline and were used to compare changes to the public conversation thereafter.

Two *standard graphs* were used to quantify public discussion of family violence, and show change over time, against the five Royal Commission dimensions. This revealed alignment and divergence between public discourse and policy frameworks.

Two *ribbon graphs* (see Fig. 2.1) were used to represent and quantify the change in news media and Twitter topics, between 2014 and 2018, and the continuity and discontinuity of those topics. We drew out insights from this analysis. For example, in Twitter data, victim survivors and perpetrators are discussed more directly and pointedly, and victim survivors voice their own experiences, to a far greater extent than in news media, policy reports and inquiry submissions. At a high level, we showed that the public conversation changed in relation to the 2015 hearings of the Royal Commission and policy framing. Unlike Twitter, which consistently followed the hearings and amplified the issues they raised, news media reporting was much slower to change or respond to the Royal Commission. The news coverage only took off with the rise of the #MeToo movement in late 2017.

A *Twitter timeline graph* identified key public events against peaks and troughs in Twitter activity (Fig. 2.2). This helped to discover when there was attention to key policy events and other influential public actions and controversies.

*Bubble charts* were also used, drawing on named entity analysis, which quantifies mentions of people or organisations in the data. This showed the relationship between Twitter and news media items by key topic area and influential people and organisations. These changed over time.

**Fig. 2.1** Topic modelling analysis of Twitter topics related to family violence 2014–2018. *Note*: Ribbon graph adapted from data in "Community responses to family violence: Charting policy outcomes using novel data sources, text mining and topic modelling" by A. McCosker, J. Farmer, and A. Soltani Panah, 2020, *Swinburne University of Technology*, p. 24, https://apo.org.au/sites/default/files/resource-files/2020-03/apo-nid278041.pdf. (Copyright 2020 by Swinburne University of Technology. Adapted with permission)

Through the named entity analysis, we identified key players in the public debates surrounding family violence over the target period. This included politicians, advocates and activists, as well as news organisations.

**Fig. 2.2** Timeline and peaks of Twitter activity addressing family violence by year (2015 and 2016 represented). *Note*: Twitter timeline analysis graph adapted from data in "Community responses to family violence: Charting policy outcomes using novel data sources, text mining and topic modelling" by A. McCosker, J. Farmer, and A. Soltani Panah, 2020, *Swinburne University of Technology*, p. 29, https://apo.org.au/sites/default/files/resource-files/2020-03/apo-nid278041.pdf. (Copyright 2020 by Swinburne University of Technology. Adapted with permission)

## **Outcomes and Lessons Learned**

The data analysis gave fresh insights relating to how family violence was discussed and changes over time post-policy change. It showed the DPC that there were datasets that could inform their outcomes about public attitude and public discussion changes. Where they had previously relied on community surveys that tend to feature limited demographics in response, by re-using other datasets they could access a wider range of attitudes and language. Analysis raised new issues that they had not thought about previously, such as what topics were featured in policy compared with public concerns. For example, there was limited and abstract discussion of perpetrators, but as time passed, there was more nuanced discussion on Twitter about men as perpetrators and social and structural factors influencing family violence. That the news media continued sensationalising tropes about violence showed that government still needed to do more to influence news media reporting. They found out that the public uses different and diverse words (compared to policy) to depict and discuss forms of family violence, particularly using the term 'abuse'. An evolving timeline of public responses highlighted that policy events influenced volume and duration of peaks in Twitter discussion more than some very serious crime events. Analyses also highlighted how particular people and organisations influence the conversation in different directions. Together, the analyses gave a much more nuanced perspective about how the public responds to policy that could inform useful changes to policy over time.

The project featured collaborative research around evaluating outcomes in relation to a significant social policy issue with government departments and arms-length agencies. As such, it showed that through collaborating to bring multiple knowledges and skills to the table, existing data could be re-used to find evidence, rather than collecting new data. We introduced new types of data and analytical methods and showed how partners' current social media analysis could be refined and extended.

The work led to our research team developing ongoing relationships with the departments and agencies. Specifically, it also led to a presentation at a key government knowledge transfer event and to newly funded research about accessing, integrating and analysing the government's longitudinal datasets on family violence.

Re-using data and using novel data analytics techniques is challenging, and in large, traditional, bureaucratic organisations requires determined champions to drive experimentation and change. While we were fortunate to work with a series of senior advocates within government, the project was hampered by multiple senior staff changes throughout the study period, affecting continuity, support and understanding of the work.

The collaborative processes we used may appear time-intensive, but they offer substantial methodological benefits from bringing in different expertise, perspectives and questions and achieve direct impact in influencing knowledge and awareness about data amongst those that participate. Potentially, these representatives are inspired to return to their departments and agencies and be more confident about advocating for using data and growing skills in data use.

For further information about the project see McCosker et al. (2020).

## **Case Study 2: Re-using Operational Data with Three Non-Profits**

## **Project Goal**

Explore the relevance and feasibility of data analytics for non-profits through deploying a collaborative data action methodology.

#### **Project Description**

Australian non-profits are aware of the rise of the data analytics movement, but many lack the capability and resources that would allow them to fully utilise their data via analytics. The three non-profit partners in this project provide services for different target groups and have different existing requirements to use data, including to report to external funders and government regulators. Each has gathered a set of datasets over a number of years in relation to their work.

We facilitated a series of iterative workshops with staff to identify their organisational 'pain points' (i.e., problems and questions), understand their datasets and determine if and how data analytics could be used to provide new insights that could guide future strategies. We also developed a series of educational webinars about working with data, including information on relevant laws, local policies, technological tools and open data portals. Non-profits' staff were interviewed at the beginning of the project to assess aspects of their existing organisational data capability and their hopes and expectations. Interviews were repeated at the end of the project to discover benefits and reflect on learning and challenges.

The project ran from 2020 to early 2021. While originally we envisaged multiple face-to-face meetings and training sessions, ultimately all sessions were conducted online. Both non-profits' staff and researchers spent several months in lockdown due to the COVID-19 pandemic and dealt with multiple operational challenges while they participated in the project.

#### **Collaborating Partners**

The project was funded by the *Lord Mayor's Charitable Foundation* (LMCF) (a philanthropic foundation based in Melbourne), the non-profit organisations that participated, and a small grant from our university. The non-profit partners were:

1. *Yooralla,* an organisation providing services for people with disabilities in their homes and the community.


## **How the Project Began**

Leaders at the LMCF partnered with our team because they were interested in exploring how partnering with a university data lab to find, examine, analyse and visualise data could build new capabilities in understanding and using data.

Once initial partial funding from LMCF was secured, the next step was to identify and attract three or four non-profits that would also co-fund their participation. Establishing agreement from the non-profits to participate sometimes took several conversations over two to three months, involving researchers, non-profit managers and staff. The researchers shared examples from past data projects, as well as examples from initiatives like The GovLab (https://datacollaboratives.org) and NESTA UK's data analytics projects and reports. While there was strong initial interest from potential partners, negotiating to the point of securing participation and funding was a significant challenge. As the COVID-19 pandemic hit, one partner (a large community health service provider) was forced to withdraw to focus on core business.

#### **Summary of Datasets Used**

We focused on re-using non-profit partners' internal datasets but drew on open public datasets to support and complement these datasets, helping to produce new insights (Table 2.3).


**Table 2.3** Datasets used in the three non-profits' analyses

## **Methods**


The non-profits were responsible for identifying relevant internal datasets and ensuring these were de-identified according to the Australian *Privacy Act 1988*. These datasets were shared with Swinburne researchers via SharePoint (a secure enterprise file-sharing platform).

Following workshops 1 and 2, the research team's data scientists worked with non-profits' staff to generate visualisations based on partners' internal datasets. Following workshop 3, some open public data sources were analysed and visualised to compare or add value to internal data analyses. Involving non-profit staff in obtaining, cleaning, analysing and visualising data gave them opportunities to identify potential value from data analytics as well as to understand the work, technologies and governance issues involved. Collaborative working between university and non-profits' staff inspired discussions about future investments in data science capability-building for their organisations.

The workshop approach drew on aspects of the *data walk* method pioneered by the Washington DC-based Urban Institute (Murray et al., 2015). This method focuses on visualising data and sharing and discussing visualisations as a method of collaboration, participation and iteratively honing analyses to address participants' questions.

#### **Data Analysis**

**Entertainment Assist** Data scientists from the research team worked with Entertainment Assist to generate several different visualisations using the *Intermission* course evaluation survey data. Descriptive statistics and sentiment analysis were applied. In workshop discussions, differences between managers and staff cohorts undertaking the training were identified, and this drove a next round of data analysis further exploring the responses from these groups. Workshop 3 raised the idea of comparing programme participants by job, as those taking the course range from young performing artists to older technical staff. Word clouds, sentiment analysis and other types of statistical analyses compared data from the Intermission dataset with data from the Australian Bureau of Statistics' Australian National Survey of Mental Health and Wellbeing. The comparison generated new insights about the potential impacts of the Intermission programme for particular at-risk cohorts as highlighted by national data.

**Good Cycles** Data about training by employee from the Transitional Employment Program dataset was initially used to track workers' progress in building employment skills over time. Thereafter, worker journey data was used to generate a geospatial visualisation showing 2514 trainees' bicycle journeys during the course of service delivery over three months. Bicycle journeys were visualised as trails on a map of Melbourne's suburbs.

Building on these initial analyses, geospatial data about trainee journeys from Good Cycles facilities to customer sites was compared with environmental modelling data from the City of Melbourne Transport Strategy 2030 (City of Melbourne, 2020) to help calculate the environmental benefits, in terms of reduced traffic congestion, reduced carbon emissions and improved citizen health outcomes, of employees travelling by bicycle as opposed to car or truck.

**Yooralla** Yooralla was interested in improving staff experiences of work, and analysis began by examining internal operational human resources and training datasets. Geospatial and temporal visualisations were initially generated, showing aggregated data about staff demographics, rostering history and training by Yooralla service location. Thereafter, the objective became to discover variables linked to staff retention, and one suggestion was to compare staff demographics with distances travelled to reach workplaces. A key question pursued was whether distance travelled to the workplace might influence staff retention.

For discussion at workshop 3, datasets analysed included Australian Bureau of Statistics (ABS) data about median levels of general population employee income across Melbourne, compared with geospatial postcode data for Yooralla employees and geospatial postcode data about employees' primary workplace (ABS, 2020a). Datasets were compared for any insights relating to associations between median income for suburbs and staff home and work locations.

## **Findings**

**Insights from Data Analyses** Each non-profit participated in generating analyses and visualisations that they considered helpful in understanding and explaining the challenges they brought to the project. As examples, staff of Entertainment Assist were able to better understand the significance of their training course for particular target groups and to consider how training might be tailored for different groups. For example, young, mostly female dance students and stagehands who are mostly middle-aged men would both be key target groups but would need differently configured training content.

Data analysis and visualisations generated allowed Good Cycles to demonstrate their contribution to the environmental sustainability of Greater Melbourne because the impact of employees' travel by bicycle could be calculated in terms of impact on congestion, emissions and public health. Figure 2.3 provides an indication of how Good Cycles' employee journey data can be shown. This particular depiction selects out only three cycling employees' journeys across Melbourne from the Good Cycles' depot but serves to show the type of geospatial visualisation that Good Cycles found useful.

**Fig. 2.3** Geospatial visualisation of three Good Cycles' employee journeys

Insights for Yooralla included understanding the impact of the locations of their service hubs (often in higher income suburbs) in relation to where their staff could afford to live (a majority resided in mid-lower income suburbs). Disparities meant staff had long journeys to work and this potentially related to staff retention. Through a visualisation of internal and ABS employment and income datasets, Yooralla saw that the average daily commute for their employees was nearly 60 km return journey. This is considerably further than the average Australian commuting distance (ABS, 2020b). This led the Yooralla team to consider whether new work practices and staff work locations could be significant when trying to improve staff retention. Insights generated from the work ultimately led Yooralla to develop new policies for employee rostering.
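The chapter does not describe how commuting distances were derived. One plausible approach, sketched below purely as an assumption, is to take the great-circle (haversine) distance between the centroid of each employee's home postcode and that of their primary workplace, then double it for the return journey. The coordinates, field names and function names are hypothetical, and straight-line distance will understate actual road distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def mean_return_commute_km(staff):
    """Average return-trip distance, given (lat, lon) pairs for home and workplace."""
    trips = [2 * haversine_km(*person["home"], *person["work"]) for person in staff]
    return sum(trips) / len(trips)


# Made-up postcode centroids, not real staff locations.
staff = [
    {"home": (-37.70, 144.80), "work": (-37.85, 145.05)},
    {"home": (-38.00, 145.25), "work": (-37.82, 145.00)},
]
print(round(mean_return_commute_km(staff), 1))
```

Working at postcode-centroid level also keeps individual home addresses out of the analysis, which suits de-identified data.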

## **From the Before and After Interviews**

In interviews held at the start of the project, the non-profits' managers shared their initial goals for participating. The main themes are summarised below, with illustrative quotes.

*Improve organisational data know-how*: "The best-case outcome is that … we improve our definitions, we improve our measurement, and we improve our data collection … and we have a culture, we have a discipline around capturing data" (Entertainment Assist).

*Inform organisation strategy*: "I think we've got very rich data. We've got a lot of data. And obviously, it's getting through all of that information and providing it that will inform change, that will inform improvements, that will make changes for the better" (Yooralla).

*Generate new insights*: "I think there is an opportunity … to look at what other areas we could be exploring with this data. I think there is an opportunity to actually look at all the information that we have, and look at it in different ways, and look at it in more meaningful ways" (Good Cycles).

*Show outcomes and impacts to funders*: "Obviously there are a number of incredibly generous philanthropic organisations out there and seeking support for particular programs and projects is an important part of our work. [This project] … helps us to quantify some of the outcomes that we're seeking to achieve" (Entertainment Assist).

At the end of the project, participants identified immediate benefits from using data visualisations in reports to board members and funding bodies. For example, Good Cycles used a visualisation as part of a competitive tendering process to show the advantages their use of bicycle transport had for the environment:

[The client] said, 'What's your footprint? What sort of area can we cover?' So, I got [Swinburne data scientist] to send me the heat map … I packaged that up and we sent that back to the client, to demonstrate how far north of the CBD [Central Business District] we go, how far south-east and west. It was good, it was a valuable piece of data. (Good Cycles)

All participants reported that the iterative workshop discussions of visualised data helped them to understand challenges and impacts associated with using their data, which built their skills for working with data. One organisation, for example, realised there was a need to streamline current use of open text in reporting processes to generate more consistent and useful information:

People would put in the same concept [into the database] in 40 different ways … [It was] a bit of a wake-up call for us, and it really clarified that there's only five major classifications that we want to look at in terms of risk, and that it's actually easier for us to show what the problems are to stakeholders if we just use five risk classifications. (Yooralla)

#### **Outcomes and Lessons Learned**

The project took a long time to start, partly due to challenges of the pandemic and lockdowns, but also because potential partner non-profits were uncertain about committing to participation. In preliminary interviews, staff 'confessed' their lack of formal training in data analytics or their lack of experience with specific tools or resources for managing and visualising data. Some expressed embarrassment about the 'messiness' of their organisation's data. While most participants worked with data to some degree, all assessed their understanding of data practices as limited.

Concern was particularly acute where large volumes of data were already generated. Participants discussed workarounds to deal with poor systems or their lack of know-how. For example, one participant described downloading datasets from the organisation's proprietary human resources software, which they then manually imported into Excel to generate monthly reports.

A key finding from the project was that through collaborating with the university team, non-profit staff and leaders developed a different philosophy of thinking about data. They started to view data, its collection, and stewardship as a resource management issue, with datasets as resources that were useful to them depending on their skills and knowledge around using them. This was a shift from thinking about data as a compliance issue, something they *had to do* to assuage funders and regulators. Non-profit participants started to think about protecting and owning the value in data with an eye to the insights they could glean from different types of analyses.

Despite multiple challenges caused by working during the pandemic and its lockdowns, project aims were met. Unforeseen impacts included participants reporting that working with data sparked new collaboration between internal staff teams that had previously been siloed. This prompted new thinking about ways the combined teams might work with other organisations to combine resources and build data collaborations.

For further information about the project, see Albury et al. (2021).

## **Case Study 3: City of Greater Bendigo Data Collaborative**

## **Project Goal**

Assess the feasibility and potential benefits of a community data collaborative.

#### **Project Description**

Place-based planning and collaboration to address community challenges is encouraged in Australian government policy (Government of Victoria, 2020). However, planning for rural places is challenged by lack of data at meaningful spatial levels (Payton Scally et al., 2020). Forming a data collaborative could help by enabling re-use and pooling of data from multiple sources, including non-profits' internal data and open public data. In this project, seven organisations collaborated with university researchers to test the feasibility and potential of pooling and sharing data. The City of Greater Bendigo covers a population of 120,000 living in urban suburbs and rural localities. It is 153 km (two hours' drive) from central Melbourne, the capital of the state of Victoria, Australia. Working with managers of the partner organisations, the project identified, obtained, analysed and visualised open public datasets and organisations' internal datasets, with mainly geospatial analysis and visualisation by suburbs and localities. During 2021, a series of workshops involving organisation staff and researchers were held to discuss topics of interest, identify datasets, consider useful ways to analyse data and then discuss the resulting analyses and visualisations of the datasets, which were mainly geospatial. Ultimately, this process informed development of a prototype *community resilience indicator dashboard*.

#### **Collaborating Partners**

Partner organisations included a national bank; City of Greater Bendigo council; Haven Home Safe, a non-profit homelessness services provider; Murray Primary Health Network, a government-funded primary health services commissioning organisation; Women's Health Loddon Mallee, a women's health service; and Bendigo Community Health Service and Heathcote Health Service, two community healthcare providers servicing different parts of the City of Greater Bendigo area. Our Swinburne University Social Data Analytics Lab team worked alongside the community partners.

#### **How the Project Began**

The project started because a community health service manager was interested in exploring whether a data collaborative could help to overcome the lack of data available for assessing services' impacts on local health and wellbeing. The manager mobilised a group of other managers of local organisations to form a data collaborative working with our team of data science and social science researchers.

An initial workshop discussed practicalities of data collaboratives and presented examples of international community data initiatives, such as those led by the National Neighborhood Indicators Partnership and The GovLab. Following this, the organisations each contributed to a fund (to an approximate total of US\$50,000) to form a data collaborative, and they nominated a lead organisation. Their self-organisation meant the partners committed to work with each other from the start.

As well as an overall contract between the university and the lead organisation, individual data-sharing agreements had to be established between the university and each organisation. We provided a standard template, but each organisation had to generate separately a data-sharing document agreed by their lawyers. This variously took one to five months to organise. As each agreement was signed, we started working with their staff to identify datasets and analyse their data.

While established methodologies about the process of data projects emphasise the need to start with a focused problem or question (GovLab, 2022), our partners found it difficult to identify a specific shared problem. All were interested in community wellbeing and resilience and potentially had datasets that could inform those topics. Consequently, we suggested developing layers of geospatially visualised data, each layer broadly relating to a community resilience topic. Given the partner organisations, the topic-focused data layers we suggested were social connection/isolation, caring, financial wellbeing, housing/homelessness and community health service use.

#### **Summary of Datasets Used**

We used open public datasets as well as re-using partners' internal datasets, as Table 2.4 shows.

**Table 2.4** Datasets for community resilience data collaborative

#### **Methods**

**Discussion Workshops** Six workshops of organisation representatives were held at key stages. Early workshops established organisations' missions, topics of interest and relevant datasets. Discussions with organisations were ongoing between workshops, particularly about establishing data-sharing agreements. Datasets were analysed by the researchers in liaison with organisation staff and explored collaboratively through subsequent workshops. These workshops revealed insights, as identified by partner organisations, enabled discussion of caveats of the datasets and considered useful ways to present the data while maintaining unidentifiability and paying heed to emergent considerations for partners. For example, we discussed how to present bank data; ultimately this was presented as an index of financial wellbeing, along with other relevant financial wellbeing datasets. The workshop process helped to build relationships, mutual knowledge and trust between the partners, even though most workshops were held online.

## **Data Analysis**

Geospatial visualisation by suburbs was adopted as an analytical approach because most of the datasets had location data, and a place-based approach resonated with partners. As well as considering what open public data was available, each collaborating partner also worked to identify internal datasets that could be re-used and shared. A set of criteria drove identification of datasets to include, as follows:


Flexibility was required because some datasets were not analysable by suburb, meaning we had to explore other ways to analyse and present some data.

Once each organisation worked through the process of generating a data-sharing agreement, partner organisation managers then shared their dataset(s) with researchers in a suitable format for analysis. Some organisations were able to navigate this stage more quickly than others, depending on data governance practices and availability of dedicated data staf. It was particularly challenging (and for some organisations, impossible) to obtain aggregated data about health services.

Some organisations requested help to export their data. Organising data by suburb was not a standard metric for all organisations. Some collect data at postcode or local government area (LGA) level, which was insufficiently granular for the analyses sought. Suburbs have the disadvantage that they have highly varied population sizes, with some (especially rural localities) having small populations (sometimes <50). This makes it challenging to report results as unidentifiable and reduces the reliability of the Census-derived datasets, because the Australian Bureau of Statistics (ABS) introduces deliberate errors when numbers are low, to protect privacy.

Given the caveats above, datasets were aggregated by suburb where possible and then combined into a single table using the R programming language. The data was exported, joined to a shapefile of suburbs and displayed as a colour-coded geospatial visualisation (map) using Power BI.
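The aggregate-then-combine step can be sketched in a few lines. The project used R; the example below uses Python with only the standard library, and the dataset names, field names and values are invented for illustration rather than taken from the project.

```python
from collections import defaultdict


def aggregate_by_suburb(records, value_key):
    """Sum one numeric field per suburb."""
    totals = defaultdict(float)
    for record in records:
        totals[record["suburb"]] += record[value_key]
    return dict(totals)


def combine_tables(named_tables):
    """Outer-join several {suburb: value} tables into one table keyed by suburb.

    Suburbs missing from a dataset get None for that indicator.
    """
    suburbs = sorted({s for table in named_tables.values() for s in table})
    return {
        suburb: {name: table.get(suburb) for name, table in named_tables.items()}
        for suburb in suburbs
    }


# Invented sample records from two hypothetical partner datasets.
survey = [
    {"suburb": "Suburb A", "respondents": 40},
    {"suburb": "Suburb A", "respondents": 25},
    {"suburb": "Suburb B", "respondents": 30},
]
service = [{"suburb": "Suburb A", "clients": 12}]

combined = combine_tables({
    "respondents": aggregate_by_suburb(survey, "respondents"),
    "clients": aggregate_by_suburb(service, "clients"),
})
```

An outer join is used here so that a suburb absent from one partner's dataset still appears in the combined table; joining to a suburb shapefile for mapping would follow the same keyed-by-suburb pattern.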

To facilitate comparisons between datasets, data was expressed as proportions of people or households. Different datasets had different samples, so some were reported as a proportion of the entire population, while others were reported as proportions of other denominators, for example, of respondents to the council survey, by suburb.
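As a minimal sketch of this normalisation, each indicator can carry its own denominator so that the resulting proportion is always labelled with what it is a proportion of. The indicator names, counts and labels below are hypothetical.

```python
def proportion(count, denominator):
    """Return count / denominator, or None when the denominator is missing or zero."""
    if not denominator:
        return None
    return count / denominator


# Invented rows: each indicator records which denominator it uses.
indicators = [
    {"suburb": "Suburb A", "indicator": "households_in_stress",
     "count": 150, "denominator": 1000, "denominator_label": "households"},
    {"suburb": "Suburb A", "indicator": "feels_connected",
     "count": 20, "denominator": 80, "denominator_label": "survey respondents"},
    {"suburb": "Suburb B", "indicator": "feels_connected",
     "count": 5, "denominator": 0, "denominator_label": "survey respondents"},
]

for row in indicators:
    row["proportion"] = proportion(row["count"], row["denominator"])
```

Keeping the denominator label alongside each value makes it harder to compare, by accident, a share of all households with a share of survey respondents.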

#### **Findings**

**Community Resilience Data Dashboard** With most datasets analysed by suburb, the geospatial map format shown in Fig. 2.4 was favoured by most workshop participants. One, two or four maps could be shown on the screen so simultaneous comparisons could be made between different topics or different indicators or datasets about the same broad topic. Ultimately, a data dashboard was generated with an opening interface showing the different topics (Social Connection, Financial Wellbeing and so on). Users could click through to datasets on these topics and view data geospatially visualised as maps, with other graphical representations also available on-screen for deeper dives. For example, the social connection map by suburb is accompanied by a bar graph by age group. Suburbs could also be clicked on via the map for more granular information about age group and other demographics.

**Fig. 2.4** City of Greater Bendigo Community resilience dashboard layers by suburb

## **From the Before and After Interviews**

Interviews with partner organisations were held at project start and end. Below, issues raised at each stage are summarised, with some example quotes. This serves to highlight the outcomes and process of change for participants.

At the *start of the project*, participants raised three main aspirations: access to data, connecting with data, and building capability. These are summarised below, sometimes with illustrative quotes.

**Issues About Data** Themes discussed related to lack of access to useful data, including low granularity, insufficiently current data and decline in tailored help from government statistical agencies as their funding has contracted. Partners were frustrated by apparent complete inaccessibility of some datasets (e.g., health data) and hoped the project would help them to find ways to access this data or to find out why it was so hidden. In terms of their own data, partners sometimes noted feeling overwhelmed; for example, "We have just so much data that's in our systems, but actually being able to pull it out and make sense of it and gain insight and intelligence from it is a continuous challenge" (homelessness service). All were intrigued by the potential to use data more and sought to probe the benefits and boundaries of data re-use.

**Connecting with Data** Participants saw beyond the immediate challenges and thought working together with data could be a catalyst for bringing organisations together for community benefit: "For the health services and other providers as part of the co-op, it might just actually make a difference and be a way we can all collectively advocate for a more interconnected service system. We know at the moment there's a lot of wasted time and effort and money for the service providers, but also the clients who just get shunted from one place to another" (homelessness service).

**Building Capability** Generating data capability for individuals, organisations and the community was mentioned by most participants: "It's actually growing some capacity in our region to use data together" (women's health service); "So our organisation would have capacity in terms of well, how to design data sets for instance, so that they are analysable" (homelessness service).

By the *end of the project*, partner participants reported feeling more confident and empowered about using data. While they noted insights gained about their community from data analyses, their main reflections were about gains in data capability and collaborative relationships.

**Insights About Community** Participants noted their preconceptions about more-or-less resilient suburbs were not all borne out when actual datasets were analysed. For one suburb not previously identified as having challenges, data analyses showed consistent deficits, when compared with other suburbs, on multiple resilience indicators. Another suburb perceived as wealthy was suggested, via data analysis, as vulnerable regarding social isolation. Participants noted this made them want to find out more about what was happening in these suburbs, that is, to get some ground-truthing for verification of the information suggested by the data analyses.

**Capability Built** All participants discussed increases in aspects of data capability. One participant highlighted appreciation of governance matters for using and sharing data, while another had started working with her organisation's data specialist and was working more with data herself. One participant, a data manager at a health organisation, noted the project had made him question his organisation's reluctance to share data: "I've come to question some really tired governance structures. Maybe it's done because we don't understand what's being asked, but really, it's about avoiding the risk. I don't have a solution, but it's become quite obvious" (health service commissioning organisation).

Participants discussed strategies developed to deal with data sharing challenges, for example, making indices to show relative levels of indicators across different suburbs. The power of sophisticated visual displays was highlighted: "I found it really riveting the first time you guys showed those maps… it was just—I loved it" (community health service No. 2); and "Service managers are often quite visually driven, so it's quite powerful in that sense, the power of the data seeing it displayed" (health service commissioning organisation).

**Connecting with Data** The project helped to build relationships and understanding between organisations. One said: "I guess I've become more aware of the value of the process, perhaps even more so than the value of the outcome" (Council). Talking about and with data was suggested as useful for building knowledge about each other's work through data. Bank participants said they had increased understanding of community challenges and they were able to introduce this knowledge into other discussions within the bank.
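One simple way to build such a relative index is to rescale each indicator so that the lowest suburb scores 0 and the highest scores 100, which shows relative position without publishing the underlying figures. This is a sketch of one possible construction (min-max scaling), not the method used in the project, and the sample values are invented. Rescaling alone does not guarantee unidentifiability.

```python
def relative_index(values_by_suburb):
    """Rescale raw values to a 0-100 index relative to the other suburbs."""
    values = list(values_by_suburb.values())
    low, high = min(values), max(values)
    if high == low:
        # All suburbs have the same value, so place them all at the midpoint.
        return {suburb: 50.0 for suburb in values_by_suburb}
    return {
        suburb: round(100 * (value - low) / (high - low), 1)
        for suburb, value in values_by_suburb.items()
    }


# Invented raw indicator values for three hypothetical suburbs.
raw = {"Suburb A": 2.0, "Suburb B": 5.0, "Suburb C": 8.0}
index = relative_index(raw)
```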

#### **Outcomes and Lessons Learned**

Overall, the project was well received, with participants more enthused at the end than at the start! Participants worked their way through data challenges as they arose, finding workable solutions, for example, using an index when working with potentially sensitive data to avoid any risk of identifiability. On this topic, participants were primarily concerned about reputational risk for their organisation if someone used analysed data out of context as, in all other respects, they were sure they were reusing data safely and ethically.

Contrary to advice to start with an identified question (The GovLab, 2022), partners in this project benefited from a period of exploring data with each other. At the start, each had their own interests and did not know the work of other organisations. Significantly, they also did not know what data might be forthcoming from their own organisations. The project was a journey of discovery in many ways and, at the end, participants were more knowledgeable and confident to agree next steps of work with data as individual organisations and collaboratively.

While the project started with organisations focused on getting new insights from data, from around half-way through the project, partners agreed a different significant outcome was forthcoming. This was building mutual knowledge through exploring data together that enabled them to see what each could contribute to collective change at community level. Further, they felt empowered to use data in their own work and could see where it might support work of the organisation because they could now understand their operations and services through a lens of data. Some commented they had started to work more confidently on data governance issues. For example, the homelessness service identified gaps in data due to incomplete collection. Managers said they would use new data visualisations to illustrate to staff the benefit of collecting complete datasets.

Data sharing remains problematic. One health organisation simply did not provide data because of perceived challenges of sharing. The data manager explained it was too difficult and time-consuming to navigate the necessary processes (potentially impossible, he thought). Most encouraging was that some managed to navigate data sharing, helping to generate novel analyses that gave new perspectives about the community.

To read more on the City of Greater Bendigo Data Collaborative see Farmer et al. (2022) and https://datacoop.com.au/bendigo/.

## **Summary**

Above we have provided three case studies of data projects from our research and working with partners. While each is different, they all involve collaboration between people and/or organisations with different expertise and perspectives. Similarly, each case re-used different datasets and targeted different insights.

Each of the cases provides evidence of learning and changes in relation to using data among staff of the participating organisations. We understand this as *influencing aspects of the data capability* of the organisations that participated. With Case Studies 2 and 3, we were able to evidence changes through interview data collected *before and after* the project. With Case Study 1, the government Business Insights Unit was able to extend its range of types of analyses to inform policy once it learned new techniques of using social media data and found new data sources. In Case Study 2, each organisation's participants expressed surprise that their routine datasets could be repurposed to address real operational and impact measurement challenges. Case Study 3 yielded several examples of changes in awareness, with a participant of one organisation talking about using data much more in her own work. Most of the participants also remarked on their increasing and more confident interactions with their data staff and teams due to their practical and applied learning from the data collaborative project.

The datasets and analysis techniques varied. While Case Study 1 used innovative Natural Language Processing techniques and public 'big data', linking disparate existing datasets and geospatial analysis was more important for Case Studies 2 and 3. Common to each case was a collaborative process of data discovery, repurposing, linking and *sense-making*. That is, each case shows the significance of identifying and exploring existing datasets and considering how they can be re-used and linked with open and public data. Equally important is the process of data visualisation and, in each case, this enabled processes of collaborative sense-making with the data.

In terms of collaboration, Case Study 1 involved participants from different departments and agencies of government involved in generating, implementing and evaluating policy, but also staff of the Business Insights Unit who were already engaged in aspects of data analysis. In Case Study 2, the participants brought together around projects were from across departments within each of the non-profit organisations. These staff tended to note that they generally work in isolated departmental silos. The data project brought them together to discuss how their work interconnects, driven by working with data. In Case Study 3, the collaboration was among different organisations working in the same community. Interestingly, for each of these different types of collaborations, we noted the same set of emergent phenomena or benefits. Participants got to know and understand each other's work partly through the purposeful action of the process, but also by discussing and probing data generated by the work of different participants at the table (or on the Zoom call). Further, new relationships were forged that could lead to more efficient and effective, and certainly better-informed, future working together. As a participant in Case Study 3 noted, she came to understand "the value of the process even more so than the outcome".

Each case raised barriers and challenges that both helped to ground participants' expectations about the potential of data analytics and sent them back to their organisations to question practices or to make change. For example, in Case Study 3, the homelessness organisation wanted to improve the completeness of its data, and the health service commissioning organisation participant wanted to explore governance practices that served to keep health data hidden. In Case Study 1, participants came to understand the value of aligning the outcomes measurement framework with likely available data from the start, rather than trying to tack things together after policy implementation. All participants came to understand the challenges of sharing data between collaborating partner organisations.

## **Key Takeaways from This Chapter**

In this chapter, we jumped straight into some case studies of non-profits and data analytics. This was done to ensure that readers know what kind of work we are talking about and to illustrate the range of possibilities for types of datasets to work with, visualisations and participants. Key points to take away from this chapter are listed below.

#### Key Takeaways


Undertaking the case study projects in this chapter with diverse organisational partners led to our conceptualisation of data capability and appreciating the benefits of collaborative working that are explored in Chap. 3.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **3**

## **Data Capability Through Collaborative Data Action**

In Chap. 2, we presented case studies of some of our data projects that involved working with non-profits and other types of organisations and re-using varied datasets. Each of these projects saw participants move from curiosity about data analytics, to a growth in confidence around using terminology, understanding techniques and having a grasp of non-profits' internal data resources. We argue that this represents the participants making progress in building aspects of the data capability of their organisations as well as understanding gaps. From our experience, successful results happen in data projects when people with diverse backgrounds and perspectives collaborate to explore issues of direct relevance to them, drawing on varied expertise, infrastructure and datasets. Organisations have existing data practices and resources, and so experimenting together with novel analytical techniques and types of datasets can help partners with a social mission to understand what to do next to extend and tailor their future data practices.

What we found through our projects with non-profits, then, is that *collaborative data action* supports the *building of data capability*. As depicted in our case studies, collaborations can draw across teams within a single organisation, across a set of like-minded organisation partners and externally with researcher partners and others. In this chapter, we move from examples showing the sometimes messy business of non-profits working with novel datasets, to attempting to secure some concepts and processes that underpin non-profits working with data analytics. Thus, we explore here what we think data capability looks like for non-profits and provide our methodology for supporting capability to build through collaborative data action. In doing so, we suggest priority topics for non-profits to address, principally around establishing responsible data governance and being clear about ethics and consent.

Again, we note this is based on our practical work up to 2022, and from our base in Australia. Law and practices relevant to non-profit data analytics will be different in other countries and regions and are changing over time.

## **Understanding Data Capability**

Drawing on our own research, we suggest that at an organisational level, *data capability* is a holistic resource. It involves having in place the interconnected aspects of appropriate *staff roles and skills, technologies*, and *data management practices and processes* to fulfil what an organisation needs and wants to do with data. In data science, *capability* has a dual meaning, relating both to human competencies and technical components like software, hardware and database systems. In our work, we retain this sense of data capability as multi-faceted and interconnected with multiple technical and human attributes. Data capability is additionally hard to pin down, we suggest, because it is situated or adaptive to context; that is, data capability will vary according to each non-profit's work, mission and vision in their operating context. We realise this can make data capability seem elusive and hard to measure, but we suggest it is most realistic to think of it as this combined, evolving, overall resource.

Data capability is related to data management and data governance. *Data management* is about having a system of internal practices and mechanisms for controlling data within an organisation. DAMA International describe centralised, distributed and hybrid models of data management, referring to the way parts of an organisation can work collectively and independently when managing and working with data (2017, p. 565). *Data governance* is the framework of ethics, safety and accountability practices that interweaves with and shapes how data management is done. We return to explore data governance as a foundation for data capability later in this chapter.

We suggest *data capability* is the *outcome* that non-profits should be aspiring to achieve as they increasingly use data analytics. However, it is not static; rather, it is refreshed and continually reformed via *processes* of engaging with datasets and new ways of working with, and using, data. This means the data capability of an organisation formulates through adaptation and change via ongoing experimenting and learning with data. Considering our Chap. 2 case study projects as processes of learning, participants were generally more knowledgeable, confident and comfortable with using data and interpreting data analyses by the end of projects. While we did not have formal evaluation in all our projects, we witnessed instances of increased engagement with data among a wide range of staff members (not just data or IT professionals) and the adoption of more sophisticated data practices, often across teams and individuals who didn't normally work together. Participants developed agility and confidence in their ability to determine when and which types of data analytics and visualisations would be useful (or not) in specific contexts. They were generally more excited and animated about the potential of working with data into the future. Underpinning these findings, participants also talked about changes that would need to be made, particularly to their data management and data governance practices. Examples of this include questioning risk aversion in sharing datasets and talking about the need for strategic consideration of reconfiguring data governance. These are all aspects indicating the way data capability forms and provide examples of the multiple and small steps by which data capability develops in relation to context.

In our projects, we saw non-profits' data capability influenced through processes of practising with their *own* internal datasets for insights about *their* problems and challenges. This seemed impactful, compared with participating in generic training modules or engaging with generic resource kits (as we tried in Case 2 described in Chap. 2). While building data capability still implies financial investment in technologies, infrastructures and skilled people, collaborative practice can help participants work out what their organisation needs and target their spending on priorities. Depending on who is involved in collaborative projects, progress in data capability can be activated strategically (from the top down) where senior managers participate, or from the ground up, through the action of practitioners in consumer and client-facing roles.

Responding to sectoral interest in increasing data analytics expertise across the non-profit sector, several frameworks have emerged for measuring and monitoring development of organisational resources related to having data capability (for example, see the work of https://data.org in the US). Some stakeholders, such as philanthropic foundations or non-profit representative bodies, seek to benchmark how individual non-profits compare in their *data maturity* against others in the sector. They also apply frameworks to identify sectoral strengths and gaps. Some assessment tools have rating scales, for example, with a low score for initial or ad hoc practices, to a higher score for systematically *managed* or *optimised* data practices (see, e.g., DAMA International's rating scale [DAMA International, 2017, p. 531]). In the UK, Data Orchard's *Framework for Measuring Data Maturity* in non-profit organisations (Data Orchard, 2019) aims for expert-level resources and practices or *mastery* as the goal, with maturity examined on dimensions including data uses, analysis, leadership, culture, tools and skills. We explored the difference we see between data capability and data maturity or data literacy in Chap. 1, saying why we prefer the idea of data capability as a goal for non-profits. This is mainly because we do not think data resources like human skills, technologies and practices should be fixed, but rather adaptive relative to each non-profit's context, strategy, mission, size and so on.

While we express reservations with static frameworks, one of our own collaborative research projects driven by perspectives from multiple Australian non-profits led to the creation of a broad data capability framework (Yao et al., 2021). This identifies attributes participating non-profits considered central to their data work. These are assigned to four domains: (1) *access* to quality data; (2) data *skills* and ability; (3) effective *technology* systems, tools and data infrastructure; and (4) responsible data *governance* (see Yao et al., 2021). However, even given this framework, we have found more generally in our work with non-profits that rather than embracing levels of attainment on a fixed scale, many emphasise they have nuanced and varying needs and goals for data use. Consequently, the value of frameworks, for them, was suggested as offering shorthand checklists against which to reflect on organisational strengths and gaps against an indicative industry standard.

Building the more holistic resource of data capability also enables non-profits to influence and activate beyond their own operational matters. For larger organisations, this could involve sharing data expertise with other, smaller organisations and helping to develop sector-wide collective responses to social problems. Alternatively, it could involve developing shared data resources or data collaboratives like the *Humanitarian Data Exchange (HDX)* (https://data.humdata.org/). Having data capability provides a foundation for a non-profit to partner with their clients and communities on data projects with wide social benefit. Hendey et al. (2020) depict this as non-profits contributing to a wider social mission of enabling *community data capability*. While no single model of community data capability exists, the authors argue that when data capability and resources are democratised and available to those who can benefit, "communities will be better equipped to partner with foundations, apply data to understand issues, and take the actions needed to achieve the ambitious outcomes that [philanthropic] foundations seek" (Hendey et al., 2020, p. 1). Non-profits are well placed, due to their work and missions, to drive community data capability goals.

## **A Collaborative Data Action Methodology**

Our case studies in Chap. 2 show where we have worked in collaborations with non-profits, sometimes with staff members across teams of one organisation and sometimes across organisations. In those projects, we observed teams and groups addressing a data challenge, but also in the process, developing or at least influencing their data capability. Some of the impacts of working collaboratively are highlighted at the end of Chap. 2. Observing the projects, their direct outcomes and wider impressive impacts has made us committed to collaborative working; and in this section, we talk specifically about our collaborative data action methodology.

There could be a range of different ways that non-profits could gain data capability through collaborative working. This could be through working with other non-profits with large or specialist data science teams, working more effectively across teams within their own organisations, or accessing data collaboratives or external *data for social good* initiatives (see this book's appendix). The point is to engage with others with a shared social mission and to gather a team of people that combines useful knowledge, skills and perspectives.

There are some very practical implications of collaborating that we have already alluded to. These include accessing others' expertise and resources to help improve your own organisation's access to costly resources and to learn what you need by efficient contextualised learning. There are also wider benefits of collaborating. Firstly, the field of data analytics is moving so fast at present that it requires dedicated specialists to keep up. This is just data science, of course, and the fields of social justice and addressing a social mission have also changed dramatically in response to the pandemic and its ongoing effects. A simple benefit of collaborating is that it gives access to a wider range of human resources to keep up with changes in knowledge and techniques across fields of expertise and practice. Collaborating is also a way to help keep small, potentially niche non-profits operating as the sector becomes more corporate and favours larger organisations. Finally, and importantly, organisations collaborating with data for social good help to build the field. Working together generates new networks, social capital and communities of practice between organisations that will impact more widely to foster community data capability.

In our projects, we use a process of collective 'learning by doing' or *collaborative data action.* The process allows for experimentation and adaptation. It allows individuals within non-profits, including senior managers and board members, to see how working with data can help to integrate their operations and services across departments (i.e., wider benefits). And it can help to empower and activate grass-roots practitioners in incorporating data work as part of their daily practice.

While data projects will vary in their precise process due to different participants, questions, data and timelines, we have found there is a consistent set of main activities that punctuate collaborative data action in our data projects with non-profits. Figure 3.1 outlines these main activities, giving an approximate chronology.

At this point, we highlight that we have mainly used the collaborative data action methodology when working with organisations seeking to find out whether data analytics is useful for them. This could suggest it works best for those setting out from *a low base*; however, that is not the whole story. For example, the bank in Case Study 3 had a large and sophisticated data analytics team, and in Case Study 1, we worked with the business insights unit of government, a team specialised in data analytics to inform policy. Rather, then, perhaps the collaborative data action methodology is best regarded as a mechanism for experimenting with data analytics. Experimenting can involve starting out, but it can also involve trialling different techniques for data analysis or addressing more ambitious goals. Thus, collaborative data action can involve organisations that are skilled-up and advanced in working with data. Of course, a key element here is that an organisation can access a range of knowledge, technology or other resources that can help to work with data in different ways or inject other types of knowledge (e.g., from social science or community practice) into data analytics.

**Fig. 3.1** Process of collaborative data action for non-profits' data projects

In our projects, we tried out various activities as part of processes of experimenting and collaborating in data projects. Some approaches we initially included turned out to be blind alleys—for example, the general educational webinars we provided in Case Study 2 turned out to be less well-received than learning by doing experienced with participants in addressing their organisations' challenges and using their data. Ultimately, we arrived at a methodology comprising a relatively consistent set of activities that helped to produce project outputs and processes and within which participants said they experienced learning and enjoyment.

Steps in our collaborative data action methodology involve different kinds of actions (see Table 3.1). Some steps involve *exploring.* Step 1, for example, is about simultaneously exploring ideas from previous case studies, questions to focus on, and useful datasets, all in order to test the feasibility of undertaking a data project and deciding its initial scope.

Step 2 involves turning to specialist experts, examples and precedent for help to formally get started. If a project is being undertaken internally and involves just one organisation, then a data protocol should be drawn up establishing what is to be done with data and why. If a project involves collaborating and sharing data across organisations, then data sharing agreements will be required that allow partners to work together with internal datasets. Data sharing is notoriously complex and requires engaging with legal principles influenced by the laws and guidance that apply in different geographical jurisdictions. Individual organisations will also have their own protocols and require compliance with sectoral guidance. We have indicated some current resources that can help to think about data sharing and what is required in data sharing agreements in the appendix. Data sharing across organisations is also revisited later in this chapter.

In our projects we also found that it was useful to build in some formal *stocktake* or evaluation 'before and after' opportunities to facilitate reflection at the start and end of data projects. This enables participants to identify changes in their attitudes and practices at individual and organisational levels. This stocktake can be simple and involve thinking about and documenting concerns about data, aspirations for using data and assessments of expertise and readiness. At the end of projects, it can be about what was learned and what gaps remain. Stocktakes are at steps 3 and 5 of our methodology. We did not include formal data gathering stocktakes in our early projects (e.g., Case Study 1), but we discovered their value in Case Study 2 and then applied this learning in Case Study 3 and other projects since.

**Table 3.1** Steps in the process of collaborative data action for non-profits' data projects

Step 4 involves *iteration* of several activities of working with datasets, aiming to answer questions and point to next steps. It involves analysing and visualising data and then exploring and discussing results. Once analyses and visualisations have been explored, it is usually necessary to cycle back a few times to identify other useful datasets and analyse and visualise these, all with the target of getting closer to an 'answer' to the questions set or topics to be explored, and of finding out more about the topic(s) involved.

In our projects we employed cycles of workshops using an approach inspired by the data walks method of the Urban Institute's National Neighborhood Indicators Partnership (Murray et al., 2015). Data walks involve workshop discussion where participants are shown visualised analyses and encouraged to ask questions, engage with what *they see* in the data and sense-check this given their grass-roots knowledge. Iterative rounds of data analysis followed by discussion help participants to make sense of data that has been analysed and visualised and to discuss with each other the stories they perceive to be told in the data. Visualisations are an important part of data walks, as diagrams, geospatial maps and graphs tend to be commonly accessible to participants from different backgrounds. In our projects, data walks were useful for considering topic-based insights but also for stimulating technical queries about datasets and exploring issues about data collection affecting interpretation of analyses.

Based on feedback on analysed and visualised data from the workshops, new datasets may be identified and analysed, new types of analysis might be conducted with the same datasets or different visualisation techniques might be employed. Then new analyses and visualisations would be brought back for further discussion and sense-making at a workshop, with the idea being to cycle through multiple workshops until a question or focus topic has been sufficiently addressed. Open-ended cycles of iteration can be challenging to explain in funding applications and contracts, so it may be useful to consider that in our projects we found three to four iterative cycles generally produced useful findings. After more than three to four cycles, the project might lose impetus and participants might lose interest.

Exploring questions and datasets collaboratively in workshops helps to generate a shared understanding and language around data use and outcomes sought. The collaborative methodology ensures that each participant shares their perspective in these sessions and their take on featured questions and data. This means that no single department within an organisation or dominant partner, if working across organisations, imposes their viewpoint. Taking an exploratory approach can generate wider buy-in by showing that different participants can have different, equally valid, ways of understanding a question, problem or challenge being addressed. Understanding can be gained here about how problems are multi-faceted, prompted by discussing insights suggested by data analyses.

This working between question(s) and dataset(s) that we describe involves processes of *adaptation*, with a goal of matching data with questions. Sometimes the adaptive process leads to framing a question in a different way. At other times, there is a realisation that a whole and perfect dataset to answer a pre-defined question does not exist, prompting a turn to other data that can *inform* about a question if not answer it directly. An example here was where the state government participants in Case Study 1 came to realise that a comprehensive dataset precisely aligning with changed attitudes to family violence did not exist. Instead, we harnessed Twitter data and news media data with textual data analytics to show a quite granular change in topics discussed over time. At the same time, we know there are caveats about some of these datasets. For example, Twitter users are a self-selecting, more policy-aware community. The government itself periodically conducts a Community Attitudes Survey covering attitudes to family violence but, again, responses in that dataset are from self-selected participants who tend to be older and more educated. Together, the data from the three sources (Twitter, news media, community survey) can be *triangulated* to give richer, though still not comprehensive, information about the extent of discussion (in this case related to family violence), variety of topics discussed and responses to different types of policy and other events.

The adaptive way of working between topics and questions that we adopt is one way that our approach is potentially distinct. Other data project methodologies we have seen emphasise pursuing and identifying *a precise problem or question* before proceeding to data analysis (e.g., The GovLab, n.d.). While it is important to have a broad initial focus, we have found it can be difficult for non-profit partners to identify specific questions or *pain points* at the start of a data project. This can be because participants do not yet have a grasp of what data might be available or what might be possible (and not possible) with data analytics, and they may need time to understand the work of other participants. In our experience, focus for projects does happen, but it emerges or sharpens through working with data, discussing questions iteratively and learning what is possible and useful. Being open as to focus can be challenging for non-profits to justify in funding applications, so a useful strategy is to identify a broad topic to explore from the start.

Following the end-of-project stocktake at step 5, the process concludes by acknowledging what has been achieved in terms of data product outputs and wider outcomes in relation to learning or partnerships, and by deciding what next steps are appropriate, if any.

## **Finding Your Data Collaborators**

In this book, we propose that building data capability should not be a solo practice. Building data capability could be done through working on experimental data projects and these might benefit, depending on their scope and goals, from the skills and perspectives of a range of different people, teams and organisations. Preferably, this would also include lived experience consumers, clients and citizens because they will help to make more insightful, ethical data products and extend data capability within the community. In Chap. 2, we showed that the collaborations we have worked within took multiple forms. They involved working across departments *inside* an organisation (as with Good Cycles and Yooralla in Case Study 2, and multiple departments and agencies of government in Case Study 1) and working *across* non-profits and other community organisations (as in the City of Greater Bendigo data collaborative project in Case Study 3). In each case, our university-based social data analytics team brought expertise in data science and social science, as well as access to technologies and safe, secure practices. The collaborating partners brought their expertise, which also involved data analytics skills and understanding of problems and contexts. When we were re-using non-profits' internal datasets, their staff could explain how data was collected and what was included and excluded in datasets.

We term the various participants in data projects (people, teams, organisations) data collaborators. While a range of perspectives makes the collaboration more than the sum of its parts, clearly the main thing we are focused on is the potential offered by injecting advanced know-how about data science and analytics. It is a premise of this book that the projects we describe are about building (greater) data capability for non-profits. In our projects, the university team brought access to advanced data science knowledge, technology and practices. While here we mainly focus on university teams, there is a range of ways to access collaborating partners with data science expertise. Non-profits might partner with other, perhaps larger, non-profits that have specialist data analytics teams or collaborate together to approach some external entity with expertise. In the appendix, we suggest some data analytics initiatives that have a particular mission to build data analytics capability of the non-profit sector. Initiatives working to support data capability development are sometimes termed *data intermediaries* or *data institutions* (Hardinges & Keller, 2022). These might offer opportunities for mentoring and learning in partnerships (Perkmann & Schildt, 2014; Susha et al., 2017), although some data intermediaries are more engaged as *brokers* between organisations and data owners (Sangwan, 2021). In encouraging collaborations between non-profits and other social sector actors to grow data capability and community data capability, we align with the concept of the organisational partners envisaged in the National Neighborhood Indicators Partnership. Many of its local partnerships combine community organisations, non-profits and councils working with university social data analytics labs (Arena & Hendey, 2019).

As university researchers ourselves, we recognise and suggest the potential of seeking out a university social data analytics lab to work with. The opportunity is that such labs will often share the social mission orientation of non-profits, and there are many examples of labs situated in universities around the world. Some university data analytics labs will be actively looking to partner for access to 'real-life' projects for training data science students. As one example, the Center for Urban and Regional Affairs (CURA) at the University of Minnesota (https://www.cura.umn.edu) links academics and students with community organisations to generate data analytics projects, specialising in data for neighbourhood planning. Other examples of university data labs working with non-profits can be found in the literature; for example, Tripp et al. (2020) describe a partnership between an education and literacy non-profit and the West Georgia University's Data and Visualisation Lab. Of course, generally universities do still require funding to work on data projects. This could come directly from a non-profit or partnerships could be formed with university labs to apply, together, for funding.

Different partners collaborating with data and sharing knowledge and skills generates new *boundary spaces* (Susha et al., 2017). These enable novel combined skillsets to emerge, helping to grow a future workforce of people that understand both non-profit work and data analytics. Research literature describing *how to do* data analytics for social good emphasises the significance of a diverse team, including data scientists, social scientists, practitioners and lived experience consumers and clients (e.g., Williams, 2020).

## **Responsible Data Governance**

In the last part of this chapter, we focus on practices that all non-profits will already have considered in some way if they are working with data: these are practices of data governance. Data governance is understood here as having the systems and processes so that an organisation can ensure data is managed and analysed responsibly, legally and ethically. It involves having clear mechanisms through which an organisation, and its people, are held to account about the production and use of data. We focus on data governance here because it is a priority consideration for an organisation working to re-use its data. Having appropriate data governance in place is a necessary precursor to working in data projects, particularly when engaging with other organisations in a collaboration. It is also a feature that organisations can start working on without having to wait to find data collaborators to work with.

Having responsible data governance enables an organisation to have safe and secure data, accountability, quality assurance and ethical data practice. Active engagement across organisations in data governance will result in a positive data culture, with all staff, clients, consumers, managers and board members engaged in well-considered, ethical data work.

Co-ordinated practices of responsible data governance should be thought through and implemented by any organisation collecting and using data. Data governance sits around, permeates and directs data management, including affecting who works with data (roles and skills), technologies and how they are used, and the nature of practices and processes in handling, storing and analysing data. Governance will need to be able to respond to changing organisation requirements to use different datasets with different types of analyses. Data governance needs to be integral to organisational governance, not seen as separate, as it relates to whole of organisation best practice and accountability. With increased production, storage and use of data, and the consequent potential for many forms of data harm, data governance has become an important aspect of organisational governance (Redden et al., 2020). This includes aligning and interweaving data practices with the protocols and policies that guide an organisation's practices around ethics, risk management, compliance, administration and privacy (Governance Institute of Australia, 2022).

The significance of data governance makes it a strategic organisational issue, and the priority data governance is given by organisations will determine what they can do with data. The values inherent in how data governance is implemented shape the goals and outcomes of using data. This includes ways of viewing relationships: customers and clients can be 'mined', and their data 'extracted', or they can be consenting collaborators, with their needs aligned to how data is used.

Depictions of data governance in the research literature can suggest a commercial emphasis inappropriate for the non-profit sector. For example, Otto (2011, p. 47) defines data governance as "a companywide framework for assigning decision-related rights and duties in order to be able to adequately handle data as a company asset" (cited in Alhassan et al., 2018, p. 301). Objectifying data in this way, as a kind of commodity, serves to disregard the integrative relationship between data, people and services. It might be said, therefore, that non-profit data governance models resemble those of commercial organisations but also differ from them in some ways, with differences driven by the mission, context and vision of each non-profit.

While frameworks for data governance tend to be internally focused, the requirement for formal policies and protocols is increasingly driven by interactions with the external environment. This is especially true in relation to embarking on data collaborations involving other organisations and sharing datasets (Verhulst, 2021). Indeed, increasingly, experts advocate for data stewards as a kind of data governance role for organisations serious about developing data capability (Verhulst et al., 2020). "Data stewardship is a concept with deep roots in the science and practice of data collection, sharing, and analysis. Reflecting the values of fair information practice, data stewardship denotes an approach to the management of data, particularly data that can identify individuals" (Rosenbaum, 2010, p. 1442). Data stewards would be responsible for understanding the datasets that exist in organisations and ensuring their quality. One role for organisational data stewards would be in bringing internal datasets into collaborations across organisations to facilitate data collaboratives and data sharing.

While designating a data steward signifies organisational acknowledgement that data governance is important and demands an owner, the holistic nature of data governance suggests it is also a collective action issue. As touched on in Chap. 1, clients, customers and other people who are in the data of non-profits and involved in its collection should be included in designing data governance that assures fairness and empowerment. Some researchers have demonstrated "the value in theorizing data governance as a collective action problem and argue for the necessity of ensuring researchers and practitioners achieve a common understanding of the inherent challenges, as a first step towards developing data governance solutions that are viable in practice" (Benfeldt et al., 2020, p. 299).

Ethics and consent are at the heart of responsible data governance and are discussed below. Clarity about ethics and relationships of consent and trust is essential because of the imperative of accountability to all of the people who are stakeholders in the data. Getting ethics and consent right sets non-profits up to succeed in more ambitious, innovative and strategic efforts of working with data beyond basic use of internal datasets, that is, in looking to data collaboratives and data sharing.

Data culture is closely related to data governance. When data governance is working well, it becomes embedded and part of the everyday practice of organisations, contributing to a positive data culture. Clearly data culture can be of varying quality, dependent on attributes such as inclusion in governance, ethics-orientation and embeddedness in roles, operations and strategy. We understand data culture here as the organisationally embedded ways of understanding and working with data ethically and safely. Central to having a positive data culture is instilling and embedding genuine concern about the relationship between the people who generate the data (bearing in mind Williams' assertion that "data are people" [Williams, 2020, p. 220]) and what can thus be done with data. Disciplined thinking about consent and trust must be established and maintained. Data culture relates to the values of organisations around enabling and empowering people (staff, clients, customers and others) and accountability to these stakeholders. While we found little written about organisational data culture and its development, it seems an issue that is close to consideration of organisational ethics.

## **Data Ethics and Consent**

Issues of ethics and consent are fundamental to consider from the start of any data collection. They are difficult to 'retrofit' if a non-profit decides it wants to re-use data originally collected to measure outputs or for statutory reporting. Clearly as well, addressing these issues is not about organising so that a non-profit can have the data it wants to work with. The question of who owns the data, and is *in* the data, is the ethical issue here. As highlighted in Chap. 1, work is ongoing internationally to partner with people who are (in) data to drive its ethical collection and use. Indigenous scholars have perhaps gone furthest in showing why and how marginalised groups should be driving collection and use of data about them. For example, Kukutai and Taylor (2016) documented the importance of affirming Indigenous people's rights to self-determination via recognition of data sovereignty.

Some practical guidance and resources to help non-profits achieve ethical data use and re-use have been developed by data initiatives internationally (e.g., National Neighborhood Indicators Partnership, 2018; NESTA, 2022; and see the appendix). In our own work in collaborations with non-profits, we have found that some materials about ethics and consent can be high-level, too general or too specific in their nature for application across diverse contexts. As a body of advice, the sheer amount of guidance can even seem overwhelming. Perhaps because of this, among the communities of data practice where we have participated, non-profits tend to share and adapt data management, privacy and security policies among their networks and to develop norms around data collection and use through cumulative processes. Data ethics is not always explicitly discussed, even if care and responsibility are taken in all data practices. Here, we suggest how to begin to think about and apply data ethics, irrespective of precise frameworks or protocols, by focusing on establishing relationships of care and consent in data production and use.

Firstly, there are legal considerations in using personal data, and data governance is entwined with regulation and increasingly the subject of law reform across different global jurisdictions. Laws governing personal data have dealt mainly with issues of privacy and cybersecurity but are becoming more complicated as technology develops and services become 'digital-first'. Because these are jurisdiction-specific, all we can suggest here is to consult jurisdictional sector representative bodies and the government agencies established to guide and inform adherence to relevant laws. If working with *sensitive data* (for example, personal data, especially where it concerns health, race, sexuality, beliefs and associations), data ethics and data management practices (like secure or encrypted storage, de-identification and access protocols) are high priority. Non-profits should consider working with a legal advisor with relevant understanding of data, information and privacy regulation.
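To make the idea of de-identification a little more concrete, the following is a minimal Python sketch of one common technique, pseudonymisation with a keyed hash. It is an illustration under stated assumptions rather than a description of the practices used in our projects: the field names (`client_id`, `name`, `address`, `phone`) and the sample record are hypothetical, and pseudonymised data can still carry re-identification risk, so it does not replace legal advice or a fuller de-identification review.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the columns in your own dataset.
DIRECT_IDENTIFIERS = ("name", "address", "phone")


def pseudonymise(client_id: str, secret_key: bytes) -> str:
    """Return a keyed hash of client_id.

    The same client always maps to the same pseudonym under one key,
    so records can still be linked without exposing the original ID.
    """
    digest = hmac.new(secret_key, client_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and swap client_id for a pseudonym."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "client_id"
    }
    cleaned["pseudonym"] = pseudonymise(record["client_id"], secret_key)
    return cleaned


# Illustrative record only; not real data.
record = {
    "client_id": "C-1027",
    "name": "Example Person",
    "phone": "0000 000 000",
    "suburb": "Suburb A",
    "service": "housing support",
}
safe_record = de_identify(record, secret_key=b"keep-this-key-out-of-the-dataset")
```

The secret key would need to be stored separately from the data, under the same access protocols as other sensitive material, because anyone holding both the key and a list of client IDs could recompute the pseudonyms.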

Beyond compliance with relevant data regulation, there is growing recognition of the need to begin with ethical frameworks and develop policies and practices for data use that involve carefully established trust and consent. By consent we do not simply mean the kinds of contractual agreement documents or pages that people sign or click 'OK' to engage with a service. These are instruments for establishing consent, but we are referring more broadly to the relationships developed within an organisation and with customers, clients and citizens around data collection and use.

Gaining consent for data use is a *process* for ensuring good data practices and relationships. It does not happen just once but is maintained and re-established as part of managing client and customer relationships and ensuring informed agreement with any new use of data. This is often approached through the establishment of norms (based on an organisation's values) of what an organisation *should* do to work safely with personal data, and with *care*. Two useful guiding principles are that any data collected should be necessary, and the purpose should be transparent and communicated clearly to those involved in generating the data or to whom it refers. This requires deciding what data is to be collected and its purpose, and an organisation may have detailed policy documents and ethical frameworks to help guide those decisions. As raised in Chap. 1, non-profits should be working towards involving consumers or clients (i.e., often the subjects in and of non-profits' internal datasets) in codesigning these practices, avoiding tokenistic forms of inclusion.

As part of data governance, a comprehensive set of data ethics protocols and policies can help to drive a positive organisational data culture. With data collection increasing, data ethics scholars have identified core concerns to be addressed. Mittelstadt and Floridi (2016) emphasise informed consent, privacy (including data anonymisation and data protection), ownership and control over data, epistemology and objectivity (or data quality), and data-driven inequality "between those who have or lack the necessary resources to analyse increasingly large datasets" (Mittelstadt & Floridi, 2016, p. 303). Franzke et al. (2021) describe the development of a Data Ethics Decision Aid (DEDA), used to reflect on and guide decisions about data projects in the governmental context. The Open Data Institute's (2019) Data Ethics Canvas identifies 14 categories to help assess ethical aspects of using data in an organisational or government context.

There are increasing moves for organisations to collaborate to share re-used data generated through their work. Our City of Greater Bendigo data collaborative (see Case Study 3 in Chap. 2), for example, was developed because seven community organisations wanted to find out whether pooling their data could help to generate new insights about community resilience. There are important ethical dimensions to such data re-use in the context of data sharing. There are logistical aspects to data sharing: why do it, with what data and for what kinds of analysis? But data sharing and re-use are underpinned by governance and ethical issues first, because data use is contingent on the arrangements in place to ensure data is treated ethically, safely and with care. Foremost is clarity about whether consent for different types of use has been established or needs to be (re-)established with those who are the subjects of the data. Consent might have been established for a primary purpose but not for a secondary purpose. In Europe, the General Data Protection Regulation (GDPR) restricts data re-use and suggests re-establishing consent for secondary use (European Parliament and the Council of the European Union, 2016). In that jurisdiction, data can be re-used for a secondary purpose if its use relates to the primary purpose and a person would reasonably expect it to be used for the secondary purpose. For health information or other sensitive information, re-use is contingent on a direct link with the primary purpose for data collection.
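The distinction between consent for a primary and a secondary purpose can be sketched as a simple record-keeping check. The Python example below is a hypothetical illustration only: the `ConsentRecord` structure, the purpose labels and the person identifiers are assumptions made for the sketch, and it does not encode the tests of any particular law such as the GDPR. It simply separates people whose recorded consent already covers a proposed use from those with whom consent would need to be (re-)established.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ConsentRecord:
    """Hypothetical record of the purposes a person has agreed to."""
    person_id: str
    purposes: Set[str] = field(default_factory=set)


def split_by_consent(
    records: List[ConsentRecord], purpose: str
) -> Tuple[List[str], List[str]]:
    """Return (covered, needs_consent) person IDs for a proposed purpose."""
    covered = [r.person_id for r in records if purpose in r.purposes]
    needs_consent = [r.person_id for r in records if purpose not in r.purposes]
    return covered, needs_consent


# Illustrative records only; purpose labels are made up for this sketch.
records = [
    ConsentRecord("p1", {"service_delivery", "shared_analysis"}),
    ConsentRecord("p2", {"service_delivery"}),
    ConsentRecord("p3", set()),
]
covered, needs_consent = split_by_consent(records, "shared_analysis")
```

Here only `p1` has agreed to the secondary purpose, so data about `p2` and `p3` would stay out of a shared analysis until consent had been revisited with them.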

Ensuring that ethics and consent issues are well considered, clear and codified, and comply with jurisdictional data legislation and practice is significant to guiding a non-profit's internal use of data. This becomes crucial when starting to work with other organisations to re-use data in collaborations. Ethics and consent practice govern the extent to which analyses of a non-profit's internal data can be undertaken, shown or shared with other organisations. While this might sound straightforward, consider what is potentially hidden in that deceptively simple idea of showing or sharing. In our City of Greater Bendigo Case Study 3 (see Chap. 2), it was one thing to look at each organisation's visualised data analyses in a workshop of seven organisations' representatives, but we then had to work out whether the visualisations could be seen by other staff or even explored in wider community engagement exercises. If visualised analyses of data could be shared, then in what formats? For example, ultimately percentages at suburb level were converted into an index of high to low relative quantities (e.g., in relation to wealth or demand for types of services) in our visualisations. This meant these could be shared beyond immediate workshop participants. This decision was taken on the basis of adhering to consents given/obtained for each dataset. The decision also responded to perceived potential reputational risks where community members might react adversely to seeing visualisations of datasets, for example, bank or service demand data, even if completely unidentifiable to individuals or households.
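One way to picture the conversion of suburb-level percentages into a relative index is the short Python sketch below. The text above does not specify how the index was calculated, so the rank-based banding used here is an assumption chosen for simplicity, and the suburb names and figures are invented for illustration.

```python
from typing import Dict, Sequence


def to_relative_index(
    percentages: Dict[str, float],
    labels: Sequence[str] = ("low", "medium", "high"),
) -> Dict[str, str]:
    """Replace each area's percentage with a coarse band based on its rank.

    Areas are sorted from smallest to largest value and split into
    roughly equal-sized groups, one group per label.
    """
    ranked = sorted(percentages, key=percentages.get)
    n = len(ranked)
    bands = {}
    for position, area in enumerate(ranked):
        band = min(position * len(labels) // n, len(labels) - 1)
        bands[area] = labels[band]
    return bands


# Invented figures: percentage of households using a service, by suburb.
suburb_demand = {
    "Suburb A": 4.2,
    "Suburb B": 11.8,
    "Suburb C": 7.5,
    "Suburb D": 19.1,
    "Suburb E": 2.3,
    "Suburb F": 9.0,
}
index = to_relative_index(suburb_demand)
```

With these invented figures, Suburbs E and A fall in the low band, C and F in the medium band, and B and D in the high band. The visualisation then shows only the band, not the underlying percentage, which is what allows it to be shared more widely. Tied values would be placed by their order in the input, so a real implementation would need an explicit rule for ties.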

## **Data Sharing for Collective Gain**

Given the issues just raised about data sharing in the example of Case Study 3, finally in this chapter we focus specifically on the data governance issue of consent and secondary use of datasets and data sharing. Because an organisation might want to move beyond re-using its own internal data and collaborate with others around data, obtaining appropriate consent is fundamental to data collection. A broad framework of thinking that we have used to guide our projects is the *Five Safes* model, initially developed by the UK Data Service (2017) to enable researchers to access government and sensitive data. This model was later adopted by the Australian Office of the National Data Commissioner as principles for access to and re-use of public sector data while maintaining data privacy and security. Though developed for public data sharing, the principles of the Five Safes are equally applicable as a guide to safe data sharing in the non-profit sector. The model helps as a high-level framework to evaluate major risk areas and to identify steps to minimise the risk of data re-use. The Five Safes model draws attention to issues of sharing data in five domains: projects, people, settings, data and outputs.


Data collaboratives have become more widely discussed, as organisations recognise the value of working together to address community challenges. In our case studies, we showed an example of a community data collaborative where a range of organisations united around their internal datasets to explore for insights about community resilience. Our data collaborative projects use our Data Co-op platform (https://datacoop.com.au) that has software, hardware, management practices, multidisciplinary skills and data governance to support safe data sharing. The platform was funded to the tune of over AU\$1,000,000 by the Australian Research Council and five universities, a scale of investment in data collaborative infrastructure that is beyond the reach of most non-profits. We propose this supports our suggestion above that non-profits seeking to undertake more ambitious and complex data analytics projects could usefully collaborate to do so.

Data collaborations can have various forms and work together for different reasons (Susha et al., 2017). Verhulst and Sangokoya (2015) give an example of humanitarian organisations working to share data for disaster relief. Following the 2015 earthquake in Nepal, Ncell, Nepal's largest mobile operator, shared anonymised mobile phone data with the non-profit Swedish organisation Flowminder. With this data, Flowminder mapped where and how people moved in the wake of the disaster and shared this information with the government and UN agencies to assist their relief efforts. The Data Collaborative between Ncell and Flowminder allowed humanitarian organisations to better target aid to affected communities, saving many lives. While there is great potential and promise for data sharing, Verhulst (2021) highlighted that collaborating with data is one of the main challenges that (big) data initiatives for public good currently face.

As part of the appendix, we highlight some examples of resources and tools about data sharing that could be used by non-profits to find more information and examples, including example data sharing agreements.

## **Key Takeaways from This Chapter**

In this chapter, we aimed to move beyond a rationale for non-profits getting involved in data analytics (Chap. 1) and illustrating how this can be done (Chap. 2). We explored data capability, a collaborative data action methodology, data governance, ethics and consent. The key points to take away from this chapter are presented below.

#### Key Takeaways


The next and last chapter reflects on overall learnings, gives practical advice about starting or proceeding, and looks to the future and its challenges and possibilities.

## **References**


2022, from https://www.urban.org/research/publication/data-walks-innovative-way-share-data-communities


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **4**

## **Activating for a Data-Capable Future**

So far in this book, we have argued for non-profits building their capability for working with data. We have presented a range of small, practical data projects with non-profits undertaken through our research in 2017–2022. These supported participating non-profits to build aspects of their data capability by helping leaders and staff to consider the skills, technologies and management practices that would be needed to match their different missions and contexts. We used a *collaborative data action methodology* that draws on diverse skills and experiences within and across organisations, enabling people to learn in practical situations. Projects generated new insights about social challenges, communities and the value of internal organisational data. This made collaborating with data a journey of surprises and creativity as well as a journey of learning.

In this final chapter, we return to our initial idea of giving a rationale for data capability in the non-profit sector, suggesting benefits and stages. In the middle, we give some activities to 'take to your manager' to get started, and thereafter to move beyond an initial data project. We also suggest some strategic actions at organisation, sector and funder levels that would help to make data analytics part of a new 'business as usual'. The latter part looks to the future and considers how emergent data initiatives could address current challenges, drawing on some illustrative examples. We conclude by reflecting on our learnings from the research and suggest areas for further studies. The content seeks to stimulate but also to reassure. We think achieving high-quality data analytics work targeted at social good is a viable prospect for non-profits; but more than that, we propose it is an essential underpinning for a bright future.

## **Sectoral Benefits of Non-profits with Data Capability**

Throughout this book, we have made various claims for benefits at the *micro*- (individual organisation) through to the *macro-scale* (community, society and sectoral structures) for non-profits building data capability. In this chapter, though, one of our aims is to provide practical material to 'take to your manager' or board. As a first step, we summarise three reasons why non-profits should invest in building data capability: to up-skill for increased organisational competence; to build a more resilient, interconnected non-profit 'field'; and to enable new forms of social justice activism.

#### **Data Capability and Organisational Competence**

Let's first check in on the contention that data capability is a key building block for non-profit organisational competence and agility in the current global environment. Sian Baker, co-Chief Executive of Data Orchard, a UK-based social business, recently stated that many of her consultancy's clients reported that having internal data capability was an essential enabler of their response during the COVID pandemic (Vaux, 2021). For example, UK-based housing service EMH Group was able to rapidly identify their tenants most in need of welfare checks, thanks to a recently enhanced internal database, and the Herefordshire Food Poverty Alliance (UK) used the findings of a 2019 food security risk audit to rapidly provide support to clients in 2020. More widely, there is increasing recognition that government and non-profits need to be able to effectively manage data in order to respond to ongoing social disruptions and disasters caused by public health challenges, climate change and military conflict in our new age of permanent crisis (Social Ventures Australia and the Centre for Social Impact, 2021; Riboldi et al., 2022). In particular, non-profits need to know what data they have, what data they lack, and how their staff can work ethically and effectively with data.

#### **Data Capability and Field-Building**

Acknowledging there are wider gains to be had, Riboldi et al.'s (2022) report, capturing post-pandemic Australian non-profit leaders' views, showed a clear consensus for a move away from charismatic and hierarchical leadership practices, towards community-engaged, collaborative decision-making. Leaders reflected on the near impossibility of building new partnerships during the COVID-19 crisis, pointing to the significance of being able to leverage "pre-existing relationships, data and insights" when reaching out to government agencies for funding and support (Riboldi et al., 2022, p. 97). Collective working has long been urged for the non-profit sector (Austin and Seitanidi, 2012; Butcher, 2014). Working with data can be a driver and underpinning structure for non-profit collaborations. In our projects we have shown multiple ways in which, and levels at which, data projects work to build collaborations (see Chaps. 2 and 3).

Working collaboratively to harness and activate data resources can help to build preparedness and resilience for crises by generating good quality data pools. It can draw stakeholders together to learn how to work with each other and to build social capital. Discussing the idea of *field-building*, McLeod Grant et al. (2020) note that non-profits need to collaborate so that bigger and stronger organisations can support smaller and niche non-profits. This will help to keep the sector diverse and able to meet the nuanced needs of different groups and contexts. Complex social challenges need a range of organisations to work together, as no single organisation can resolve them alone. The field needs to join forces on infrastructure and capabilities so it can afford to do the formidable job it needs to achieve (McLeod Grant et al., 2020). Collaborating with data can be a catalyst and enabler for wider collaboration.

#### **Data Capability and Social Justice Activism**

We also want to acknowledge and promote the potential of data analytics for social good as social justice activism. This takes non-profits' data work into a space beyond using it to resolve their own operational challenges. It seeks data work that positively spills over into activating social change in the community (Maddison & Scalmer, 2006). In this sense, non-profits could *apply* their data capability, access to multiple datasets and knowledge generated from analysing datasets. They could direct these resources to advocate for marginalised people within social policy processes and to enable citizens themselves to be active with data, through spreading digital and data skills. Here, we are saying that by engaging citizens to work with data, non-profits can empower them with data skills, and with access to new knowledge assets about their communities. Data for social good as activism aligns with Williams' (2020) depiction of social data projects as *data action*. She explains activism as being about inclusion of diverse participants, including citizens, tackling social challenges using different datasets and about ground-truthing with grassroots perspectives. Wells (2020) also highlights the credentials of data for good as social activism, saying "data for good means data for all, prioritizing equity, supporting local leaders, and questioning power dynamics, with ethics as a top priority" (para. 1).

Involving the wider community is crucial to avoid repeating past mistakes involving abuses of data that have led to risk aversion and fear. Making active steps to engage citizens is significant in shifting power dynamics. Here, we draw on distinctions made by community informatics researcher Michael Gurstein (2011), for example, who argued that making data openly available (as in open data initiatives) has tended to merely hand data assets to those already powerful through controlling and running systems. Gurstein pointed out that active steps to engage beyond managers and leaders are vital for empowering marginalised or disadvantaged groups. Similarly, Kitchin (2013) highlighted that money spent on generating accessible re-used data resources is money not spent directly on supporting marginalised citizens. Consequently, access to data must be democratised and citizens actively empowered to engage with data and inform its application. If not, increased forays into data analytics by non-profits might be seen as representing a diversion of scarce resources to bolster power among those who already enjoy it.

#### **Three Stages of Non-profits' Data Capability**

Building data capability, then, is significant to non-profits' business competency, field-building and supporting social change. At its most basic, participating in a data project using collaborative data action can be pitched to leaders as an *efficient learning programme* about working with data. It is significant that non-profits should be skilled and knowledgeable about working with data as the sector comes under increasing pressure from funders seeking accountability and from technology corporates and data social businesses seeking market share. Salesforce, for example, a US software company specialising in customer relationship management software, has a suite of products specifically for the non-profit sector (Moltzau, 2019). Googling 'non-profit data analytics' produces multiple pages of blogs and news ephemera generated by businesses aiming to persuade non-profits to engage with *their* data products and services. The non-profit sector needs data capability so it does not end up in thrall to Big Tech. Non-profits need know-how so they can be discerning about what is offered and able to ask questions to probe the 'black box' of commercial data products and systems. Equally, non-profits need data capability so they can collaborate as a field with government and philanthropic foundation procurers about sensible data generation and reporting.

Given that it could be difficult to convince non-profit leaders, board members or staff to divert resources to building internal data capability, we do not recommend that every organisation jump straight into complex arrangements, like participating in a data collaborative. Nor do we suggest that every non-profit should seek access to open or commercial datasets or undertake deep dives into sensitive data. Instead, building capability could take an incremental, staged approach:

**Stage One: Build Organisational Data Capability** The individual non-profit organisation builds off its existing data skills, practices and technologies and uses these resources as a launchpad to develop and improve.

**Stage Two: Build Sector Data Capability** Extending out from internal capability, the organisation engages in data collaborations with others in the non-profit sector. Leaders and staff seek out like-minded collaborators who are interested in similar topics and questions and who hold useful resources.

**Stage Three: Build Community Data Capability** Clients, consumers and citizens are engaged to work in equitable partnerships with data. Going beyond the non-profit achieving its own operational work in better ways, this stage offers the potential to actively extend data capability to the community.

## **Data Analytics as Business as Usual**

In Chaps. 2 and 3, we focused on data projects. However, that doesn't show how data analytics can become embedded as part of a new kind of 'business as usual' for non-profits. It doesn't consider *what happens before* and *leading up to* a data project, or what happens *after*. Here, we cover those phases, looking first at preparing for a data project and then suggesting activities for proceeding after an initial data project has been undertaken.

#### **Getting Started**

In our projects, it has sometimes taken multiple discussions before organisations commit to participating in a data project. Where organisations have been quicker to commit, this tends to be facilitated by interactions with one or more enthusiastic organisational *champions.* These participants also often help by pulling together other interested staff and leaders. Undertaking our data projects has given some pointers about what could help a staff member seeking to take this book to their manager to argue for their organisation 'getting into data analytics', perhaps by engaging in a data project. Below are some of those pointers.

**See Data Projects as a Way to Learn About (Your) Data** Doing a small data project gives non-profits' staff and leaders the opportunity to experiment with data. It allows for dialogue and collaboration with colleagues within an organisation through a novel opportunity to test the creative potential of their own organisation's datasets.

When undertaking practical data projects with non-profits, we tended to find similar concerns at the start. Many of our participants recognised that their organisations had lots of data and that they should or could be doing something with it. However, participants didn't clearly understand what data they had, what data they lacked, and how they might ask questions and answer them with data. Doing a data project, using a collaborative data action methodology, can address these issues through engaging colleagues collaboratively with their data and their own organisation's challenges.

The key benefits for organisations working on practical data projects (such as those in Chap. 2) were that participants learned new hands-on skills for *working with* specific software programmes, statistical models or modes of data visualisation. Much of that learning was about realising they didn't need to become data scientists. Rather, they learned new languages and practices that enabled them to cooperate across silos and specialisms to understand the value of data in their own organisational contexts. This, in turn, allowed participants to assess what was required in their organisation to realise the kind of data capability they needed to build. By involving a range of staff including managers and frontline workers, there was scope for learning about interactions between data and the roles of different staff members, including understanding the benefits of collecting complete datasets and of being clear around consent to use and re-use data.

**Identify Internal Data Champions and Collaborators** Leadership is a key aspect of a data project. Those seeking to do a data project should make early moves to identify senior organisation champions who can drive it. These people will be the connectors with internal teams as well as working with any external *data collaborators* (i.e., partners that you may have in other organisations). This champion role involves organising meetings and co-ordinating data protocols or brokering any necessary agreements with external data collaborators (including agreements to identify and share data, as discussed in Chap. 3). The role should not be delegated to junior staff unless they have sufficient authority (and time) to undertake these tasks across the duration of the project. While data champions have a lead role, it is significant to have a range of staff involved in data projects. Frontline workers, in particular, will have knowledge of clients and community needs and the ways in which it is feasible to collect and use data.

**Identify External Data Collaborators and Resources** These data collaborators may be brought together to form the kind of multi-skilled and multi-resourced data analytics teams described in our projects. In Chap. 3 and the appendix, we outlined various policy institutes, university data labs and other types of institutions with experience in *data for social good* projects, and perhaps with access to technology and skilled staff resources. These might act as skilled data collaborators, but a non-profit can also work with other non-profits or other organisations with aligned mission and access to useful skills, resources and perspectives.

**Identify Funding** Undertaking a data project takes time, commitment and material resources. Whether a non-profit is keen to build internal data capability or collaborate with data scientists and social scientists as in our projects, sufficient funding is essential to ensure that all parties have the time and resources to do the work. The amount of funding required will vary according to the scale and scope of activities. In the projects outlined in Chap. 2, co-funding was provided by our university, philanthropic organisations, national and state government research funding agencies and our non-profit and other organisation partners. The senior researchers provided their time as an 'in-kind' contribution, but this practice is not always supported by universities. Other ways to access expertise could be through volunteer data scientists, as in DataKind projects (see Appendix). Other resources are also required in data projects, including computers and software. While this may seem obvious at first glance, we mention these resources because their costs are not always factored into project grant funding applications.

**Be Vigilant About Ethics and Inclusion** Advocates and researchers globally have been promoting data for social good for nearly a decade. But the leaders in this field (e.g., Williams, 2020) also caution us about the ethical issues associated with data analytics. In Chaps. 1 and 3, we highlighted the importance of having appropriate consent and clarity around what consent is in place before considering what can be done with data. However, there are other concerns embedded even within datasets that should be borne in mind. Expertise in thinking about hidden ethical issues in data should be built into collaborative teams. As Guyan (2022) observes, even the collection of apparently simple demographic data involves decisions around which kinds of data will be collected, for example regarding gender, sexuality and trans experience. These choices have significant impacts on who is visible within data and thus how communities, organisations and other phenomena will appear when data is analysed. Decisions based on these data will affect how resources and services are allocated. Similarly, ethical questions should be asked regarding the potential unintended consequences of collecting, collating and communicating with data. As Williams puts it, "data are people" (2020, p. 220). Even well-intentioned data projects can cause harm when they are used to justify surveillance or control of those whose data is analysed within them.

Williams (2020) warns against what she terms 'hubris' in data projects, asking: "Why do we often think the data analyst can find the right questions to ask without asking those who have in-depth knowledge of the topics we seek to understand?" (p. xvi). As discussed at other points in this book, the centrality of citizens *in* data does suggest that non-profits need to work to include service users in data projects. There are useful frameworks and approaches to inform this work, including around Indigenous data sovereignty (Carroll et al., 2020) (discussed in Chap. 1), but we suggest that tested methods and approaches for non-profits engaging their clients and consumers with data remain a work in progress. While waiting for ethics and inclusion practices specifically in relation to this field to mature, we recommend taking the advice of Williams (2020). She suggests using the best ethics practices currently available and 'interprets' Zook et al.'s (2017) *ten simple rules for responsible big data research* to provide a list of ethical principles for data action projects (Williams, 2020, p. 93).

#### **Moving Beyond a Data Project: Next Steps**

Once one or more experimental data projects have been completed, enthusiasm fired up and initial data capability built, what comes next? How might an organisation work to embed data analytics into business as usual?

Investing in ongoing work with data could involve a non-profit adding new specialist staff and technologies, or it could involve collaborating with other non-profits and others to access specialists and technologies. Either way, different ways of working in future need to be considered.

It is increasingly suggested that any organisation, whether building its own team of data specialists or collaborating with others, should designate a *data steward* (Verhulst et al., 2020). Data stewards have a lead role in data governance and hold knowledge about an organisation's datasets, how they were collected and how they can be used. Data stewards can work with other organisations' data stewards if data is to be shared or used in data collaboratives. They are significant to generating "a richer institutional environment around data" (Hardinges & Keller, 2022, para. 23). The Open Data Institute further promotes the idea of *data institutions* (Hardinges & Keller, 2022). These can help to support those organisations that don't or can't afford to invest in dedicated data teams. Data institutions are advocated to help to "steward data on behalf of others" and to support data analytics (Hardinges & Keller, 2022, para. 1). They could take a variety of forms including data collaboratives. Working with a data institution implies the idea of a non-profit contributing to and being part of a type of collective data capability resource.

Our *Data Co-op* platform, which we used to enable the data projects described in Chap. 2, can be understood as a data institution (for other examples, see Appendix). The platform represents an expensive collective resource of data science skills, technologies and data management practices (https://datacoop.com.au/). As such, a non-profit can collaborate with us to use the platform to drive their data projects and their routine data analytics work *and/or* non-profits can work together to share data in collaborative projects (as in Case Study 3). Our *Data Co-op* is a cloud-hosted platform developed by our Social Data Analytics (SoDA) Lab in collaboration with four other Australian universities and with funding from the Australian Research Council. The platform enables researchers and collaborating partners to use secure virtual environments to access, connect, geospatially map and explore correlations between variables in datasets. These secure data environments provide close integration with Microsoft Power BI data analytics, enabling advanced visualisation of datasets. Much of the data used in our projects is open public data, such as that of the Australian Bureau of Statistics (ABS), but the platform also has a secure data layer that can hold de-identified and encrypted datasets from collaborating organisations.
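To make the idea of connecting datasets and exploring correlations concrete, the following is a minimal sketch in Python. It is an illustration only and does not describe how the *Data Co-op* platform is implemented. All values, postcodes and function names are hypothetical. It assumes that an open public dataset and a de-identified service dataset share an area code, aggregates the service records to area level, joins the two, and computes a Pearson correlation.

```python
import math

# Hypothetical open public data: an area-level indicator score per postcode.
open_scores = {"3550": 4.1, "3551": 6.3, "3555": 2.8, "3556": 5.0}

# Hypothetical de-identified service data: one entry per client contact,
# reduced to the postcode only (no names or other identifiers).
service_contacts = ["3550", "3551", "3551", "3555", "3551", "3556", "3556", "3550"]


def count_by_area(postcodes):
    """Aggregate individual records into a count per postcode."""
    counts = {}
    for code in postcodes:
        counts[code] = counts.get(code, 0) + 1
    return counts


def link_by_area(area_scores, area_counts):
    """Join two area-level datasets on the postcodes they have in common."""
    shared = sorted(set(area_scores) & set(area_counts))
    return [(code, area_scores[code], area_counts[code]) for code in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


linked = link_by_area(open_scores, count_by_area(service_contacts))
r = pearson([row[1] for row in linked], [row[2] for row in linked])
print(linked)
print(f"Pearson r = {r:.2f}")
```

Working at area level rather than with individual records is a deliberate choice in this sketch: it keeps the linked table free of personal identifiers. A correlation found this way is a prompt for discussion and ground-truthing with staff and community members, not evidence of cause.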

While working with a data institution is a way for non-profits to extend their data capability, access to data institutions is not ubiquitous across the world at present. Generating further access to data-institution-like environments, though, is an area where philanthropy could invest to nurture the data for social good movement (Hendey et al., 2020).

Throughout this book, we have argued that building data capability is important for the future of the non-profit sector and supporting social good. However, non-profits are cash-strapped and there are structural barriers to them pooling resources. In this environment, helping to build sectoral non-profit data capability is a prime space for philanthropic foundations seeking to secure the future of social purpose organisations and to promote social innovation. Philanthropy could support a range of small to larger-scale data initiatives that would be impossible for individual non-profits to pursue alone. There are already some examples of philanthropy supporting non-profits' data capability internationally. As an example, *data.org* is funded by the Rockefeller Foundation and the MasterCard Centre for Inclusive Growth in the US to "democratize and reimagine data science to tackle society's greatest challenges and improve lives across the globe" (The Rockefeller Foundation, 2022). In Australia, where we work, this kind of philanthropic investment to build capability in the non-profit sector has tended to happen in small projects (e.g., see Case Study 2, funded by the Melbourne-based Lord Mayor's Charitable Foundation). Part of the challenge is that foundations traditionally tend to target topics or themes rather than capability-building and infrastructure. However, perhaps the pandemic, by shining a spotlight on the value of online services, might spur more action on infrastructure funding by philanthropy as more reports highlight non-profits' technology-related capability gaps (Riboldi et al., 2022; King et al., 2022). Philanthropy could support place-based initiatives among collaborating non-profits like our City of Greater Bendigo Data Collaborative (Case Study 3), and as in the US National Neighborhood Indicators Partnerships (2022), and theme-based initiatives that support organisations to collaborate to tackle social challenges. Non-profits could be supported to work in data collaborations with each other and/or to work with existing or new data institutions.

## **Innovations to Solve Data Challenges**

The previous chapters have raised technical challenges in progressing data analytics that go beyond simply persuading leaders to get involved. Data sharing, for example, has been raised as perhaps the biggest challenge (Verhulst, 2021). The field's tendency towards small experimental projects is also problematic because it raises questions about the scalability of data analytics within the sector. The good news is that there are rapid changes taking place that are relevant to data for social good. While this generates excitement, the sheer amount of potentially relevant innovation means it is hard to keep up with change. It's also hard to judge what might 'stick'. Here, we share a few examples of emerging innovations to highlight the field's dynamism and the need for critical thinking about the many opportunities. It's hard to tell how quickly, if at all, some innovations could affect non-profits' work with data and, in some cases, whether the innovations actually are 'for good'.

Addressing the problem of many small projects, DataKind (an international data science volunteering organisation) has recently established a *Centre of Excellence* to build non-profits' data capability. A key pillar of work is termed *Impact Practices* (Porway, 2019). The idea grew from DataKind staff identifying that many projects they undertake with social services and non-profits are grouped around similar topics or harness similar techniques. With Impact Practices, DataKind aims to compile, make available and form collaborations around data analytics solutions addressing like topics. In this way, rather than each project starting from scratch and working with DataKind to build something new, work in topics can be translated across non-profits targeting the same social challenge. Porway (2019) writes that work is moving from a *project-based model* to a *practice-based model*, featuring portfolios of data science projects by theme. In a blog announcing the new initiative, an example is given of many projects targeting early detection of disease outbreaks. Rather than building multiple small projects, Impact Practices will unite participants to "understand what data is available, and test real prototypes in the field to understand what's really possible" (Porway, 2019, p. 3).

DataKind's work is dedicated to solving problems of the non-profit sector, and it works internationally, suggesting strong potential for Impact Practices to translate to different contexts and sizes of non-profits, potentially widely influencing non-profit data analytics into the near future.

This transferability may be less likely for our next example of innovation, which is targeted at enabling data sharing. As highlighted in Chap. 3, data sharing between organisations is a significant challenge due to each having different arrangements for consent and privacy. Internationally, there are different privacy regulations around secondary use of data varying by country jurisdictions, for example, the EU General Data Protection Regulation (European Parliament and the Council of the European Union, 2016). To address problems of data sharing across government institutions and borders, the UN Committee of Experts on Big Data and Data Science for Official Statistics is running a pilot programme using Privacy Enhancing Technologies (PETs) (The Economist Science & Technology, 2022). Current work is targeting international trade data sharing between five countries' national data agencies. PETs help data providers and data users to safely share information by using encryption and privacy protocols that allow someone to produce useful output data without 'seeing' the input data. They also ensure that anonymity of data will be protected throughout its lifecycle and that outputs cannot be used to 'reverse engineer' the original data (UN PET Lab, 2022).

This technology is exciting, but the pilot was only recently initiated and is taking place between national statistical offices, so the innovations developed could take a long time to filter down and become routinely accessible to non-profits.
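The idea of producing useful output without 'seeing' the input data can be illustrated with a small sketch. The example below uses *additive secret sharing*, one simple technique from the wider PET family. It is our own illustration under stated assumptions and is not a description of the protocols used in the UN pilot. The agency figures, the modulus and the function names are all hypothetical. Each party splits its private number into random-looking shares, and only sums of shares are ever published.

```python
import secrets

# Arithmetic is done modulo a fixed number that must exceed the true total.
# (Assumption for this sketch: all inputs are non-negative integers.)
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split a value into n shares that add up to it (mod MODULUS).

    Any n-1 of the shares are uniformly random, so they reveal nothing
    about the value on their own.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_values):
    """Compute the total of all parties' values from shares only."""
    n = len(private_values)
    # Step 1: every party splits its value and hands one share to each party.
    all_shares = [split_into_shares(value, n) for value in private_values]
    # Step 2: party j adds up the shares it received (one from each party).
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Step 3: only the partial sums are published; together they give the total.
    return sum(partial_sums) % MODULUS


# Hypothetical private counts held by three separate agencies.
agency_counts = [1200, 450, 3100]
total = secure_sum(agency_counts)
print(total)  # 4750
```

In this sketch all steps run in one process for readability. In a real deployment each party would run on its own system and exchange shares over secure channels, and further safeguards would be needed where a total could itself disclose information about a small group.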

Finally, a concern we raise in various places is citizen involvement. We have noted an imperative to have citizens engaged in data governance and data use, but their inclusion can be hindered by fear of discussing data use and lack of easily useable engagement methods. Elsewhere, we've mentioned citizen data sovereignty initiatives, for example the EU-funded project DECODE (https://decodeproject.eu/what-decode.html), which is experimenting with ways citizens can decide what happens with their data (Monge et al., 2022). And we've also mentioned good practice in Indigenous data sovereignty that can guide work with citizens (Carroll et al., 2020). In some countries, including Australia, where we work, *consumer data rights* laws have been established, ostensibly to enable citizens to understand their data and to use it for their empowerment. The Australian Consumer Data Right (CDR) is suggested to give citizens choice and control over the data that businesses hold about them (Australian Government, 2020). It enables people to transfer their data to another business to find products and services better tailored to their needs (Australian Government, 2022). Unfortunately, though, as highlighted by Goggin et al. (2019), the driver for this Act is actually to generate new data businesses, and the way the Act is explained and promoted is directed at business, with little attention to educating and activating consumers in data literacy. As Goggin et al. (2019) conclude: "In Australia, it is notable that efforts to respond to concern [about consumer data rights] have come, not in the context of an overhaul of privacy laws or digital rights generally, but via efforts, by market-oriented policy bodies …" (p. 12).

This is an example of government enthusiasm for data initiatives resulting in the advancement of for-profit data markets in which public data becomes a product that is commercialised by private developers (Bates, 2012). However, it also highlights an opportunity for non-profits to harness emergent legislation to empower and advocate for consumers. Non-profits need data capability so they can recognise and harness emergent initiatives like consumer data rights legislation and turn them into opportunities to help build citizen data and digital literacy.

The examples of innovations in this section illustrate the continually emerging initiatives that are relevant to non-profits' data analytics. They show that current data analytics challenges are likely to be resolved, but it will take time. They also raise the issue of how to keep up with the pace of change and the many disciplines and perspectives that influence it. This further supports the value of collaborating with others, if only to have a chance to keep up to date with a fast-changing field.

## **Research Reflections and Next Steps**

#### **Our Research Reflections**

Taking a step back to reflect on the research you've done in a field over several projects and years is an indulgence in a pressurised funding environment. However, it is important to do as it reveals patterns and sometimes surprises. In this case, having promoted the benefits of cross-disciplinary and multi-perspective working throughout this book, the realisation dawned that this also makes the work quite challenging. One thing that has come to the fore in writing this book is the complexity that arises from trying to meld the positionality of diverse participants and researchers. Positionality considers how your identity influences, and potentially biases, your understanding of and outlook on the context and phenomena you are working with (Bourke, 2014). Having different perspectives in a data project often means that participants have varying expectations and over-layer their learning on pre-existing frameworks and knowledge bases. To illustrate how this works even within our writing team, one of us sees non-profits using data analytics as being a contemporary manifestation of community development. Others in our team are working closely with non-profits and supporting them to organise better for using data, giving a perspective very grounded in operational issues; while our data scientist views the non-profit field as one of intriguing new datasets to which a range of old and emergent analytical techniques can be applied. Acknowledging the positionality challenges even among our writing team has made us realise how difficult it must be to navigate data projects for our multi-disciplinary, multi-department and multi-organisation practice partners. It makes us think that those who enjoy and thrive in these data projects are likely those who can deal with uncertainty, tolerate or be curious about different perspectives and who are prepared to be flexible with their expectations.

A further issue is inherent in this work *as research*. It is *very* practical, and it is highly participative. We have noted in places that it's more like a learning process than research. In terms of defining it as a research approach, it is perhaps most akin to participatory action research (McIntyre, 2007). The processes are fluid and, while they are punctuated by consistent types of steps and activities, as highlighted in Chap. 3, this fluidity can make the work hard to write up as research. And these same issues of not being able to pin down the process nor constrain the timeline precisely can be off-putting for non-profits considering working on data projects. They tend to want a defined process, with stipulated timelines and agreed (beforehand) outputs and outcomes. All of these are quite challenging to delineate at the start of the kinds of data projects we outlined in Chap. 2, when you don't know what datasets a non-profit holds or what consents governing re-use of data might exist.

While these issues about the data projects can make them frustrating and can deter some non-profits from participating, at the same time the challenges are what make the research interesting and exciting. And the need to tolerate fluidity means our partner organisations tend to be a self-selecting group of innovative early adopters, which makes them fun to work with. This is a space of social innovation, after all.

Aligned with the idea of our partner non-profits as enthusiastic innovators, we have experienced a remarkable degree of buy-in to projects once organisations commit to starting. An example of this is participants regularly turning up to data workshops over project timescales lasting 6–18 months. The City of Greater Bendigo data collaborative, for example, continues to meet and discuss data two years after we started. In that project, there is remarkable buy-in, perhaps because the geospatial data visualisations help service providers and businesses to think about the places where they live and work. Participants are able, repeatedly, to bring suggestions as to why phenomena may be 'seen' in the data analyses, help to ground-truth analyses and give suggestions about datasets and topics that could be explored next. Perhaps there is some sense of wonder at the possibility of generating sleek new data products (in their case, a community resilience data dashboard, see https://datacoop.com.au/bendigo/) from previously routine data produced as cross-sectional reports. There is some sense of excitement at unleashing a valuable resource from a previously apparently passive and dull set of spreadsheets.

#### **What Next in Research?**

Turning to what next, some topics emerge as obvious targets for research. Bear in mind that this field is about the nexus between non-profits, their work and mission, and data analytics, and not about other data-related fields like computational techniques or data law. Those areas, no doubt, have many research opportunities of their own, but we won't talk about those here.

We think the most significant issue is around working with citizens, consumers, clients and the community. Feasible, easily applied methods for doing this, with and for non-profits, need to be developed and tested and to become industry standards. Non-profits need to build their data capability, so they are confident and skilled in data to engage with consumers and clients in conversations about data *without fear*. In Chap. 1, we talked about how initiatives like the National Neighborhood Indicators Partnership engage people with (largely) open data and how this is a way to build citizen data literacy and community capability (Murray et al., 2015). This suggests that learning and engagement are best done through topic-focused engagement, rather than teaching focused on data literacy skills. Another approach is to work with consumer representative groups that many non-profits already have and start to engage people in conversations about the data they are in, data governance and re-use of data in analyses.

A second area for exploration is the set of issues around the experience of working in non-profits that have data capability; for example, what difference to organisational functioning, client outcomes and staff motivation does having a positive data culture make? As we propose that working collaboratively with data can help to integrate the work of staff and organisations, can this be evidenced robustly, and what are the impacts of better integrated organisations? Ultimately, what we are saying here is that we do not know the impacts on organisational mission and outcomes of having data capability, though we surmise there are benefits. To date, our research has focused on processes of building data capability, but what does that enable? Crudely, what is the difference between a non-profit that has data capability and one that does not? To date, there are data maturity frameworks, but how do differences in data maturity manifest as lived experiences for organisations, staff, clients and consumers? As more non-profits build their data capability, it will be exciting to see how this changes organisational structures and whether it brings non-profits together and helps to build their strength as a field, as we propose and hope for.

A final set of research questions sits around the potential for non-profits to use artificial intelligence (AI) and automated decision-making systems as these techniques become more accessible and more used. A recent blog post from Data Orchard, a UK-based data for social good consultancy, suggested that 15% of charities are now using AI (Vaux, 2021). AI demands large datasets, and so it has been suggested that, despite hype around the efficiencies it can enable, only large non-profits are likely to benefit (Bernholz, 2019; Moltzau, 2019). Cases can be found illustrating use of AI for large datasets, including by Greenpeace for donor segmentation, rainforest protection by analysing mobile phone data and case law analysis by human rights lawyers (Moltzau, 2019; Paver, 2021). Alongside this, there is interest in the potential of AI in place-based initiatives. The GovLab's *AI Localism* (https://ailocalism.org/) is a repository of AI case studies generated by cities, regions and global initiatives (Verhulst et al., 2021). The link between the growing data capability of non-profits and their entry to using AI is an important area to understand as it unfolds. Of interest is what AI might affect, in terms of the structure and nature of the future non-profit sector. Perhaps the efficiencies it enables for large non-profits will serve to drive further corporatisation and 'survival of the biggest'. But perhaps there will be imaginative place or theme-related AI initiatives based on data collaboratives or collective practices, serving to unite and enable AI and advanced data analytics as non-profit field-building. Participatory AI, or how to include stakeholders and citizens in designing ethical AI, is another area to watch for non-profits (Bondi et al., 2021).

## **Key Takeaways from This Chapter and Conclusions**

In this chapter we explored how non-profits having data capability could impact on the whole sector and society as well as giving some practical steps about what to do next within organisations. We looked at some future directions for data analytics and highlighted areas for future research. Key takeaways from this chapter are presented below.

Key Takeaways


This chapter concludes this book in which we set out to propose that any non-profit can engage with data for social good and build their data capability. While there are many challenges in this space, we hope this book makes it seem entirely doable. We also hope that while this new capability will help with non-profits' business competitiveness, it can also be experienced as a space where people work together to find creativity and enlightenment.

With its many initiatives and active, high-profile advocates (e.g., Sir Tim Berners-Lee as co-director of the Open Data Institute), data for social good could be described as almost an industry in itself now. Through collaboration and experimenting with data, we suggest that all non-profits should get inside this big tent. We end with a plea: we ask non-profits to beware getting picked off as individual organisations by commercial businesses selling their proprietary data systems. We urge staff and managers instead to get knowledgeable, get skilled, make collaborating 'data friends' of other non-profits and their staff, and to develop their organisation's data capability. This will drive the non-profit sector's data capability for good into the future. Most of all, we suggest people should just get started with working with data and experimental data projects. We urge non-profits to have fun with data in ways that simultaneously help to do (more) good with data.

## **References**


nonprofits/#:~:text=How%20do%20nonprofits%20use%20AI,the%20 guesswork%20out%20of%20segmentation


Williams, S. (2020). *Data action: Using data for public good*. MIT Press.

Zook, M., Barocas, S., Boyd, D., Crawford, K., Keller, E., Gangadharan, S. P., Goodman, A., Hollander, R., Koenig, B. A., Metcalf, J., & Narayanan, A. (2017). Ten simple rules for responsible big data research. *PLoS Computational Biology, 13*(3), e1005399. https://doi.org/10.1371/journal.pcbi.1005399

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Appendix: The Data Innovation Ecosystem and Its Resources**

## **Initiatives**

Initiatives in different countries are progressing innovations relevant to data analytics for non-profits. Figure A.1 shows some types of initiatives in the ecosystem and the range of goals they are aiming to achieve. A recent taxonomy of AI and data for social good from *data.org* provides an extended map of initiatives in the landscape.<sup>1</sup>

Below, we list some examples of the key types of initiatives. Later in this section, we also outline some of the kinds of resources and support available from these. There are many examples of initiatives around the world, and new ones keep emerging, so this list is by no means comprehensive. We focused on initiatives and resources *that we have used* and that influenced our work to date.

<sup>1</sup>Porway, J. (2022). A taxonomy for AI/data for good. *Data.org*. Retrieved March 21, 2022, from https://data.org/news/a-taxonomy-for-ai-data-for-good/.

**Fig. A.1** Initiatives and goals of the non-profits' data innovation ecosystem

#### **Think Tanks and Policy Institutes**

*The GovLab* (https://thegovlab.org/) is a policy institute based at New York University that targets capability-building for public sector governance and has developed pioneering models and tools around data governance and re-use; for example, *datacollaboratives.org* has a methodology and a portal to host international data collaboratives (https://datacollaboratives.org/).

Both *The GovLab* (US) and *NESTA* (UK) (https://nesta.org.uk/project/data-analytics/) undertake demonstrator and experimental projects to push the boundaries of social data analytics practice and establish standards. Advocating for use of data for social good, they sometimes work with partner organisations including non-profits and have wider goals as 'data institutions'<sup>2</sup> to leave a practical legacy including tools and data capability. Much of the work of these organisations is funded via philanthropic foundations, governments or corporates.

<sup>2</sup>Hardinges, J., & Keller, J. R. (2022). *What are data institutions and why are they important?* The Open Data Institute. Retrieved March 29, 2022, from https://theodi.org/article/what-are-data-institutions-and-why-are-they-important/#:~:text=Data%20institutions%20are%20organisations%20that,into%20our%20theory%20of%20change.

Specifically targeting non-profits, the *Stanford University Digital Civil Society Lab* (https://pacscenter.stanford.edu/research/digital-civil-society-lab/) has a repository of useful tools to increase data analytics capability, generated from high-quality projects.

Some initiatives are focused on building capability of citizens and communities. For example, the Washington DC-based Urban Institute's *National Neighborhood Indicators Partnership* (NNIP) (https://www.neighborhoodindicators.org/) has a mission "to ensure all communities have access to data and the skills to use information to advance equity and well-being across neighborhoods".<sup>3</sup> The focus is on using suburb or community-level data and engaging local citizens, services and non-profits together to inform local decision-making and empower through democratizing information.

The NNIP supports organisations at city and region level to codesign community indicators with citizens and to train local people as citizen scientists to gather neighbourhood data to ground-truth analyses based on open public data.<sup>4</sup> One example, drawn from the NNIP online case studies library, illustrates how community-based data projects work. The City of Oakland, US, developed a new strategy for addressing violence in the community. Existing city administrative data about reported crime, gang activity and domestic violence was analysed. Simultaneously, 16 community residents were trained to collect data about local lived experiences. Based on analyses of quantitative city data and qualitative evidence about experiences, citizens and city staff generated data-driven ideas for the new strategy, including re-evaluating gun violence prevention programmes and using trauma-informed principles.

Neighborhood Partnerships and their projects are typically funded by multiple participating organisations and philanthropy.

Other significant policy institutes and think tanks include the Open Data Institute (https://theodi.org/), the Ada Lovelace Institute, funded by the Nuffield Foundation in the UK (https://www.adalovelaceinstitute.org/), and the Data Justice Lab at Cardiff University, Wales (https://datajusticelab.org/).

<sup>3</sup>National Neighborhood Indicators Partnership. (2022). *NNIP Mission*. Retrieved March 21, 2022, from https://www.neighborhoodindicators.org/about-nnip/nnip-mission.

<sup>4</sup>National Neighborhood Indicators Partnership. (2022). *NNIP Mission*. Retrieved March 21, 2022, from https://www.neighborhoodindicators.org/about-nnip/nnip-mission.

#### **Data Science Volunteering**

This kind of initiative harnesses the power of data scientists who volunteer their skills to work with socially oriented organisations to explore the potential of using data, often via hackathons and projects. *DataKind* (https://www.datakind.org/) is one such organisation, operating through franchised 'chapters' in the UK, the US, India and Singapore. DataKind has established criteria that prospective data projects must meet in order to access volunteer help and the DataKind methodology. Once a data project is accepted, DataKind works through a set of steps with organisations to identify datasets, imagine useful data solutions and then prototype suitable solutions. One criterion for participation in DataKind projects is that the organisation will be able to maintain the data solution beyond the initial project. This suggests some pre-existing data capability is needed, although discussions on the DataKind website, giving feedback from different projects, suggest DataKind projects are good opportunities for non-profits to learn and extend knowledge.

As an example, DataKind volunteers worked with a UK food bank to develop a machine-learning model that predicts which clients will be highest users, allowing the food bank to prioritise these citizens for additional support.

#### **University Social Data Analytics Labs**

Universities around the world may be particularly well-placed to work with non-profits on practical collaborative projects that foster experimentation and growth of data capability. This is partly because universities are experienced in bringing together expertise from across disciplines and in facilitating partnerships across research and practice boundaries.<sup>5</sup> Some have social data analytics labs for research and development and to give data science students experience of working with non-profits. Examples include Auckland University Centre for Social Data Analytics, New Zealand; University of West Georgia Data Analysis and Visualization Lab, US; and our Social Data Analytics Lab at Swinburne University of Technology, Melbourne, Australia (https://www.swinburne.edu.au/research/institutes/social-innovation/social-data-analytics-lab/).

#### **Demonstrator Projects**

Large funding bodies can generate demonstrator projects to trial new ideas and solutions. One such large project is the European Union-funded Project DECODE (DEcentralised Citizen-owned Data Ecosystems; see https://decodeproject.eu/). It focuses on exploring citizen data sovereignty practices, with demonstrator sites in cities including Barcelona and Amsterdam.<sup>6</sup>

#### **Socially Oriented Data Consultancies and Businesses**

For-profit and social businesses have emerged that work with non-profits and other organisations to generate tools and re-used data resources. Examples of businesses include Data Orchard (UK), a non-profit consultancy that developed a Data Maturity Framework for non-profits to assess organisational progress in data capability (see https://www.dataorchard.org.uk/); Seer Data and Analytics (Australia) that works with non-profits and communities to design data dashboards for community development (see https://seerdata.ai/); and Neighbourlytics (Australia) that re-uses data generated by social media and sharing platforms to provide analyses about social characteristics of places (see https://neighbourlytics.com/).

<sup>5</sup>Tripp, W., Gage, D., & Williams, H. (2020). Addressing the data analytics gap: A community university partnership to enhance analytics capabilities in the non-profit sector. *Collaborations: A Journal of Community-Based Research and Practice*, 3(1), 1–10. https://doi.org/10.33596/coll.58.

<sup>6</sup>Monge, F., Barns, S., Kattel, R., & Bria, F. (2022). *A new data deal: The case of Barcelona* (Working Paper Series No. WP 2022/02). UCL Institute for Innovation and Public Purpose. Retrieved March 21, 2022, from https://www.ucl.ac.uk/bartlett/public-purpose/wp2022-02.

#### **Initiatives with Government Funding**

There are some government initiatives that can be accessed for ideas and potentially partnerships and grant funding. For example, The Data Lab (https://thedatalab.com/) has a mission to "help Scotland maximise value from data and lead the world to a data powered future".<sup>7</sup> It supports businesses of all kinds to use data, helps to run courses, and supports studentships and student placements. It is funded by the Scottish Funding Council as part of its Innovation Centres programme.

## **Useful Resources for Non-Profits Developing Data Projects**

There are many existing resources and tools that can be drawn on for examples and guidance when considering specific aspects of data projects and building capability. Table A.1 highlights some examples we have drawn on in our work. New resources and tools are frequently being developed.

<sup>7</sup>The Data Lab. (2022). *The Data Lab is Scotland's Innovation Centre for data and AI*. Retrieved April 14, 2022, from https://thedatalab.com.


**Table A.1** Tools and guides from existing initiatives



## **Glossary**<sup>1</sup>


<sup>1</sup>This glossary gives our understanding of terms as we use them in this book.

